# Data Scientists

> Develop and implement a set of techniques or analytics applications to transform raw data into meaningful information using data-oriented programming languages and visualization software. Apply data mining, data modeling, natural language processing, and machine learning to extract and analyze information from large structured and unstructured datasets. Visualize, interpret, and report data findings. May create dynamic data reports.

- **SOC code:** 15-2051.00
- **Canonical URL:** https://singulariki.com/roles/role-15-2051-00
- **Also known as:** Analytics Consultant, Applied Scientist, Data Analyst, Data Analytic Scientist, Data Analytics Manager, Data Analytics Scientist, Data Analytics Specialist, Data Architect
- **Frame:** "AI exposure" measures task overlap (how codifiable the work is); it is not a count of jobs lost or a forecast. Every figure below is traced to a named public dataset.

## What this work is

**Core tasks** (O*NET):
- Analyze, manipulate, or process large sets of data using statistical software.
- Apply feature selection algorithms to models predicting outcomes of interest, such as sales, attrition, and healthcare use.
- Apply sampling techniques to determine groups to be surveyed or use complete enumeration methods.
- Clean and manipulate raw data using statistical software.
- Compare models using statistical performance metrics, such as loss functions or proportion of explained variance.
- Create graphs, charts, or other visualizations to convey the results of data analysis using specialized software.
- Deliver oral or written presentations of the results of mathematical modeling and data analysis to management or other end users.
- Design surveys, opinion polls, or other instruments to collect data.
- Identify business problems or management objectives that can be addressed through data analysis.
- Identify relationships and trends or any factors that could affect the results of research.
- Identify solutions to business problems, such as budgeting, staffing, and marketing decisions, using the results of data analysis.
- Propose solutions in engineering, the sciences, and other fields using mathematical theories and techniques.
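
One of the tasks above, comparing models using a loss function and the proportion of explained variance, can be sketched in plain Python. This is a minimal illustration only: the observations, the two sets of predictions, and the names `baseline` and `linear` are made up for the example and do not come from O*NET.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Proportion of variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observed outcomes (e.g. weekly sales) and two candidate models.
y_true = [10.0, 12.0, 14.0, 16.0, 18.0]
predictions = {
    "baseline": [14.0] * 5,                     # always predicts the mean
    "linear": [10.5, 11.5, 14.0, 16.5, 17.5],   # illustrative fitted values
}

# Score every model on both metrics.
scores = {
    name: {
        "mse": mean_squared_error(y_true, y_pred),
        "r2": r_squared(y_true, y_pred),
    }
    for name, y_pred in predictions.items()
}

# Lower loss is better; the same model also explains more variance here.
best = min(scores, key=lambda name: scores[name]["mse"])

for name, s in scores.items():
    print(f"{name}: MSE={s['mse']:.3f}, R^2={s['r2']:.3f}")
print("best by MSE:", best)
```

In practice these metrics would be computed on held-out data rather than on the data used to fit the models.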

## Skills, tools, capabilities

**Skills in demand:**
- TensorFlow _(Specialized Skill)_
- Microsoft PowerPoint _(Common Skill)_
- Microsoft Excel _(Common Skill)_
- Apache Spark _(Specialized Skill)_
- Apache Hadoop _(Specialized Skill)_
- Unix _(Specialized Skill)_
- Shell Script _(Specialized Skill)_
- PostgreSQL _(Specialized Skill)_
- NoSQL _(Specialized Skill)_
- MongoDB _(Specialized Skill)_
- Microsoft Access _(Specialized Skill)_
- Linux _(Specialized Skill)_

**Tools & technology:**
- Amazon Web Services (AWS) software _(hot technology, in demand)_
- Apache Hadoop _(hot technology, in demand)_
- Apache Spark _(hot technology, in demand)_
- C++ _(hot technology, in demand)_
- Git _(hot technology, in demand)_
- Microsoft Azure software _(hot technology, in demand)_
- Microsoft Excel _(hot technology, in demand)_
- Microsoft Power BI _(hot technology, in demand)_
- Microsoft PowerPoint _(hot technology, in demand)_
- Python _(hot technology, in demand)_
- PyTorch _(hot technology, in demand)_
- R _(hot technology, in demand)_

## AI exposure & outlook

- **AI task-overlap index:** 99th percentile (High) across all occupations — composite of current-era exposure studies (ai-exposure-index-v1).
- **LLM task exposure, γ (OpenAI / Eloundou):** 95th percentile (High) — source: eloundou_gamma.
- **AI assistant applicability (Microsoft):** 98th percentile (High) — source: microsoft_applicability.
- **Projected employment (BLS 2024–34):** 33.5% growth (Growing fast); 23.4k annual openings; 245.9k → 328.3k jobs.
- **Pay & employment (BLS OEWS, May 2024):** median $112,590; 233,440 employed.
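
The growth rate in the projected-employment line follows from the two job counts. A quick check, assuming the rounded figures exactly as listed above (the variable names are illustrative):

```python
# Figures from the BLS 2024-34 projection line, in thousands of jobs.
base_jobs_k = 245.9
projected_jobs_k = 328.3

# Net change over the decade and the implied percentage growth.
new_jobs_k = projected_jobs_k - base_jobs_k
growth_pct = new_jobs_k / base_jobs_k * 100

print(f"{growth_pct:.1f}% growth, {new_jobs_k:.1f}k net new jobs")
# prints "33.5% growth, 82.4k net new jobs"
```

The 23.4k annual openings figure is larger than the net change per year (about 8.2k) because BLS counts openings from workers leaving the occupation as well as from growth.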

## Sources

- **O*NET** (30.3) — U.S. Department of Labor / National Center for O*NET Development. https://www.onetcenter.org/database.html
- **BLS Occupational Employment and Wage Statistics (OEWS)** (May 2024) — U.S. Bureau of Labor Statistics. https://www.bls.gov/oes/
- **BLS Employment Projections** (2024–2034) — U.S. Bureau of Labor Statistics. https://www.bls.gov/emp/
- **Microsoft “Working with AI”** (working-with-ai) — Microsoft Research. https://www.microsoft.com/en-us/research/
- **“GPTs are GPTs” (Eloundou et al.)** (arXiv 2303.10130) — OpenAI / academic. https://arxiv.org/abs/2303.10130

---
_Generated from Singulariki's joined dataset; data snapshot 2026-06-02T21:00:32.945303+00:00. https://singulariki.com/roles/role-15-2051-00_
