NREL’s Grid Planning and Analysis team is seeking a curious, sharp, and technically skilled Data Engineering Intern to help build the largest grid datasets in the world. We are looking for a candidate passionate about building trusted, validated databases from the ground up using messy, heterogeneous datasets that fuel real-world energy research and national policy decisions.
You'll be instrumental in transforming unstructured or semi-structured data into high-integrity, queryable databases that support machine learning models, visual analytics, and advanced simulations.
Responsibilities
- Ingest, clean, and validate diverse datasets from internal and external sources (e.g., CSVs, spreadsheets, text files, and SQL dumps)
- Design and assemble structured, high-quality relational databases
- Collaborate with domain scientists and software engineers to understand data contexts and ensure correct integration into the Sienna Platform
- Document data provenance, validation logic, and database design decisions
- Write reusable scripts for data transformation and validation (preferably in Python or similar languages)
- Contribute to internal Git repositories and maintain clean, reproducible codebases
Basic Qualifications
Must be enrolled as a full-time student in a Bachelor's, Master's, or PhD degree program, or have graduated in the past 12 months from an accredited institution. Candidates who have earned a degree may work for a period not to exceed 12 months. Must have a minimum cumulative grade point average of 3.0.
Please Note:
- You will need to upload official or unofficial school transcripts as part of the application process.
- If selected for the position, a letter of recommendation will be required as part of the hiring process.
- Must meet educational requirements prior to employment start date.
Additional Required Qualifications
- Completed a Bachelor’s or Master’s degree in Computer Science, Applied Mathematics, Data Science, Software Engineering, or a related field, or currently enrolled in a Master’s or PhD program in one of these fields
- Must provide a GitHub profile or portfolio showcasing relevant projects as part of the application; please include a link in your resume/CV
- Strong programming skills in Julia and Python; experience with Pandas, NumPy, or similar libraries is a plus
- Demonstrated experience working with datasets and databases
- Ability to communicate technical details clearly and collaborate in a team-oriented environment
- Experience retrieving data through APIs
Preferred Qualifications
- Proficiency in SQL and experience with relational databases (PostgreSQL, MySQL, SQLite, etc.)
- Machine learning experience