CV

About me

Full Name Jennifer (Jenna) Jordan
Job Title Data Engineer
Citizenship USA

Skills

  • Languages: Python, SQL, Bash, YAML/YAQL, Jinja, regex, XPath, R (tidyverse), SPARQL
  • Featured Libraries: dbt, pandas, plotly, streamlit, SQLAlchemy, Great Expectations
  • Data Stores: PostgreSQL (and PostGIS), BigQuery, SQLite, DuckDB, JSON, XML, RDF (knowledge graphs)
  • Other Tools: Git, GitHub, dbt Cloud, Civis Platform, Jira, OpenRefine, ArcGIS Pro
  • Concepts: data modeling (3NF, star schema), relational and analytical database design, data orchestration (ETL pipelines), data management and governance, FAIR data principles
  • As well as: data wrangling, visualization, and exploratory data analysis; teaching workshops; writing documentation; presenting projects

Experience

  • Aug 2024 - present
    Senior Consultant, Data Management & Strategy
    Analytics8
    • Advising a mission-driven healthcare organization on dbt Mesh best practices and on strategic processes to support data governance and its implementation within dbt, and providing support & training for the transition to a dbt Mesh of multiple projects.
  • Mar 2024 - Aug 2024
    Staff Consultant, Data Management
    Analytics8
    • Worked with a mission-driven healthcare organization as they migrated their legacy monolith dbt project to a dbt Mesh architecture of multiple domain-driven projects
  • Jan 2022 - Mar 2024
    Data Engineer
    Analytics Team, Department of Innovation & Technology, City of Boston
    • Initiated & led the engineering team’s migration of ~200 legacy ETL pipelines to dbt, with redesigned orchestration workflows & data warehouse schemas
      • This resulted in the team’s first data catalog, jumpstarted data governance efforts, provided oversight over and analysis of both pipelines and data, and improved the use of both computational resources and engineers’ development time
      • Presented and advocated for this work to the team, the CIO, the department, and at a conference
    • Built out custom ETL pipelines to integrate data trapped across many different sources, enabling key metrics to be tracked (e.g. time for affordable housing projects to be approved) and complicated manual processes to be automated (e.g. joining together permitting, housing, and development data)
      • Advised on structural & process changes to involved source systems across different departments
    • Wrote custom component scripts in python to ingest and export data
    • Ensured that analysts had high-quality, reliable data for analytics end products (e.g. dashboards) and that the public had up-to-date access to open data published on Analyze Boston.
  • Jan 2021 - Nov 2021
    Data Engineer, Computational Methods Instructor (Contractor)
    Network Contagion Research Institute (NCRI)
    • Built Python ingest pipelines and designed relational databases to add new social media communities (Telegram, 4chan) to Pushshift, supporting NCRI's mission of identifying disinformation and extremism on social media.
    • Ran the Computational Methods group for interns at NCRI Labs (Summer and Fall 2021 semesters)
      • Designed a semester-long curriculum for computational skills.
      • Taught weekly workshops on Bash, Git, Python, SQL, and regex (using Software Carpentry lesson plans).
      • Supervised independent projects.
  • Sep 2020 - Apr 2021
    Marketing Data Analyst (Contractor)
    Bright Wolf
    • Built a Streamlit web app to visually analyze Bright Wolf’s marketing data from Salesforce, email campaigns, and company data enrichment, enabling the sales team to see overarching trends
      • Built the ETL pipelines to ingest data from these sources via APIs into a local database
  • Aug 2019 - May 2020
    Graduate Research Assistant
    Cline Center for Advanced Social Research, University of Illinois at Urbana-Champaign
    • Supported the Cline Center with data releases by preparing/cleaning data, updating documentation, and coordinating with the data repository.
    • Promoted the Global News Index by preparing and helping to present tutorials.
    • Developed a user feedback process for Cline Center software applications and data releases.
    • Provided other support as needed.
  • May 2019 - Jul 2019
    Data Science Intern
    The Program on Governance and Local Development (GLD), University of Gothenburg
    • Assisted the Data Scientist with data monitoring tasks for an ongoing survey (LGPI 2019) conducted in Zambia and Kenya; for example, verifying that enumerators completed the requisite number of surveys in each designated hectare, according to the sampling plan.
    • Wrote Python scripts to organize and wrangle data downloaded from SurveyToGo, edited surveys in SurveyToGo, worked with geospatial data using Python and QGIS, and helped to develop the Local Governance Performance Index based on data collected from the survey.
    • Worked extensively with large XML documents, and wrote custom scripts to extract data from KML files and reformat them into WKT files.
    • Attended the 2019 Annual GLD Conference, which focused on the theme "Routes to Accountability."
  • Jan 2019 - May 2019
    Graduate Research Assistant
    Cline Center for Advanced Social Research, University of Illinois at Urbana-Champaign
    • Created user-friendly documentation for the Cline Center’s Global News Index using Adobe InDesign.
    • Wrote an in-depth codebook on the GNI’s variables and corpora; a detailed user guide and a quick-start guide for Archer (the in-house software developed to query the GNI); and a guide on using Solr to query the GNI.
    • Helped to test Archer, document bugs, and suggest features, and assisted users in querying the GNI.
  • Sep 2018 - Dec 2018
    Research Assistant (graduate hourly position)
    iSchool, University of Illinois at Urbana-Champaign
    • Investigated and compiled a corpus on Data Management Plans (DMPs) for Prof. Peter Darch.
    • Annotated datasets for use in human-in-the-loop machine learning analyses performed by Prof. Jana Diesner’s research group.
  • Jun 2016 - Aug 2017
    English Teacher
    Corem Language Institute, South Korea
    • Taught English as a Foreign Language to kindergarten and elementary students in Yangsan, South Korea, at a private academy (hagwon), averaging eight 40-minute classes each day.
    • Taught kindergarten students (ages 5-6) basic reading, writing, and conversational skills alongside fun math, science, and arts & crafts lessons; wrote bimonthly progress reports for each student, administered occasional tests, and helped conduct monthly “phone interviews” with students in the upper-level classes.
    • Taught language lessons to elementary students (ages 7-12), and was in charge of making and grading their tests and writing bimonthly progress reports for each student.
  • Jan 2014 - Apr 2014
    Intern
    U.S. Mission to the UN at Washington, DC (USUNW), U.S. Department of State
    • Worked as part of a small team advising the U.S. Ambassador to the United Nations, Samantha Power, serving as a bridge between the United Nations, the Executive Office, and the National Security Council.
    • Supported the Senior Policy Advisors, Speechwriter, and Executive Assistant with research, copywriting, and administrative tasks.
    • Wrote executive summaries and took notes on office, cross-bureau, and inter-departmental meetings for the USUNW team.

Publications & Presentations

  • presented Oct 2023
    From coast to coast: implementing dbt in the public sector
    Coalesce 2023 (by dbt Labs), San Diego
    • Presented Boston's implementation of dbt to improve data services and data engineering practices, while my co-speakers presented their projects for the State of California and Cal-ITP.
    • The session discussed the similarities and differences between the implementations of dbt, and how some of the constraints and challenges of working in government shape both the technical and social design of data services; the speakers reflected on successes, challenges, and lessons learned about adopting modern data tooling in state and local governments.
  • published Aug 2021
    Interactive Data Visualizations in Python
    The Carpentries, Lesson Incubator
    • Developed a Carpentries workshop lesson designed as an introduction to making interactive visualizations in Python.
    • Learners create a new environment using conda, wrangle data into the proper format using the pandas library, create visualizations using the Plotly Python library, and display these visualizations and create widgets using Streamlit.
  • published Jan 2021
    No buzz for bees: Media coverage of pollinator decline
    Proceedings of the National Academy of Sciences (PNAS)
    • Co-authored a Perspective paper submitted to PNAS in March 2020 and published in January 2021.
    • Wrote the ETL scripts to collect data from the Global News Index and transform the data for analysis & visualization, created an interactive visualization tool to aid in exploratory data analysis, and wrote the documentation for the data & code deposited in the Illinois Data Bank.
  • published May 2020
    Python can be tidy too: pandas recipes for normalizing data
    PyCon 2020 (virtual)
    • Published the poster online for PyCon 2020 (the conference moved online-only due to COVID-19)
    • Demonstrates a collection of recipes from my Tidy Pandas Cookbook that draw on the “tidy data” and “3rd normal form” philosophies of data organization, using data from the Correlates of War and Uppsala Conflict Data Program.
  • presented Oct 2019
    Put Relational Databases in Your Data Curation Toolbox
    Proceedings of the Association for Information Science and Technology (ASIS&T 82nd Annual Meeting; in Melbourne, Australia)
    • Presented my poster advocating the use of relational databases in the data curation process, especially for datasets that are published separately but can be used together due to a common identifier scheme and shared attributes.
    • The Correlates of War datasets are used as an illustrative example to show how the normalization process results in a design with greater data reusability, while check constraints and foreign key constraints can improve data quality.

Education

  • 2020
    Master of Science, Library and Information Science
    University of Illinois at Urbana-Champaign
    • 3.97 GPA
  • 2015
    Bachelor of Arts, Journalism, Political Science
    University of North Carolina at Chapel Hill
    • Graduated with Honors
    • 3.61 GPA

Certificates

  • 2023
    dbt Developer
    dbt Labs
  • 2022
    Analytics Engineering with dbt
    CoRise
  • 2020
    Certified Software Carpentry Instructor
    The Carpentries
  • 2016
    CELTA, Teaching English as a Foreign Language
    International House Budapest