Institute for Medical Research
Data Scientist
Work as part of a fast-paced, highly productive research team led by Dr. Stephen Freedland at
the Durham VA Medical Center. Our team specializes in urology research, with a focus on risk
stratification, health disparities, and lifestyle factors among veterans with prostate cancer. The
Data Scientist will leverage a large amount of national-level VA data using machine learning and
natural language processing techniques to build and train natural language understanding and
prediction systems.
Critical Element #1 – Analyze Data Using Machine Learning and Natural Language Processing
Techniques
1. Demonstrate knowledge of ML/NLP skills including entity recognition, extraction, and
linkage; domain classification and text categorization; event detection; text preprocessing; and
language modeling
2. Use NLP techniques to extract meaning from text found in electronic medical records
3. Develop machine learning algorithms and models to acquire domain understanding of factors
contributing to and predicting urologic disease
4. Conduct experiments to demonstrate the efficacy of the new algorithms and models that you develop
5. Manage processes and jobs using a “Big Data” framework such as Hadoop
6. Keep up with recent advances in natural language processing, machine learning, and big data
processing
7. Ensure data security, including privacy protection and backups
Critical Element #2 – Acquire, Store, and Process Data
1. Access VA databases, repositories, and charts to extract relevant data and transfer it to relational
databases
• Utilize VA documentation and resources to identify data of interest
• Evaluate data efficacy and think critically about the data
2. Integrate ML/NLP pipelines with existing database systems for storage and manual review of
data
3. On occasion, contribute to the development of database back ends for data storage and
front ends for data review
Critical Element #3 – Document Work
1. Document work via procedural manuals, comments, codebooks, and good programming
practices
2. Contribute to and review articles for submission to conferences and peer-reviewed journals
3. Provide written, oral, and/or electronic status reports of assigned tasks
Critical Element #4 – Collaborate with Study Team Members
1. Coordinate with collaborators to learn about and share knowledge of VA resources and
ML/NLP techniques
2. Provide input on project proposals and decisions related to scope, timeline, feasibility, and
other relevant issues
3. Liaise with research coordinators to ensure all procedures comply with HIPAA and IRB
regulations
4. Communicate with VA technology departments to address/resolve IT needs and problems
5. Collaborate with developers and coordinators to ensure that project deadlines are met
6. Participate in local and remote team meetings on a frequent and regular schedule
7. Train additional team members, as necessary
The above statements describe the general nature and level of work being performed by
individuals assigned to this classification. This is not intended to be an exhaustive list of all
responsibilities and duties required of personnel so classified. Employees are expected to
perform other related duties incidental to the work described herein.
About Institute for Medical Research
Institute for Medical Research, Inc. (IMR) is a non-profit, tax-exempt institute whose mission is
to support research and education at the Durham Veterans Affairs Medical Center for the
enhancement of the health and lives of the Veteran population, their families and the public at
large. Established in 1989, IMR conducts and supports extramural research and educational
activities by collaborating with the Durham VA Medical Center, private companies,
governmental agencies, foundations and academic institutions. It is through these partnerships
that the Institute for Medical Research, Inc. has developed an extensive research portfolio to
advance the health and well-being of veterans, their families and the public at large. With a
varied research program ranging from cancer to PTSD to research in space, the Institute for Medical
Research plays a vital role in the support of research and education at the Durham VA Medical
Center. http://imr.org/
Qualifications
1. MS/PhD in Computer Science, Engineering, Applied Mathematics, or other relevant field
2. Proficiency with commonly available tools and infrastructures for natural language
processing, text mining, machine learning, and parallel data processing
3. Strong software development skills in modern programming languages (C++, Java, etc.) and
scripting languages (Python, Perl, etc.)
4. Strong computer science fundamentals (algorithms, data structures, software architecture, and
object-oriented design)
5. Experience using “Big Data” platforms (Hadoop/Mahout, Spark/MLlib, etc.)
6. Knowledge of database theory, design, and querying
7. Experience with Microsoft SQL Server, Microsoft Access, Microsoft Excel, and SQL is
preferred
8. Understanding of software version control concepts and tools
9. Ability to solve problems, troubleshoot, and quickly and independently learn new skills
10. Record of contributions to research products, ideally peer-reviewed publications, is
preferred
11. Well-developed oral and written communication skills, including the ability to convey
complex technical information accurately to both technical and non-technical audiences
12. Demonstrated organizational skills with ability to manage multiple work assignments and
adjust schedules based on changing requirements, priorities, and deadlines
13. Ability to work effectively both independently and as part of a team, exhibiting a spirit of
cooperation, teamwork, resource sharing, and mutual respect
Employees work for the Institute for Medical Research and will have a WOC appointment at the
Durham VA Medical Center. Please visit www.imr.org for information about employee benefits.
Salary will be commensurate with abilities and experience. We do not sponsor applicants for
work visas. We are an equal-opportunity employer.
Please forward cover letter and resume to [email protected], citing “Data
Scientist Job Application” in the subject line.