I am a data scientist. The first thing you do when asked to use or access a source that is unfamiliar to you is consult the people who know it and read the system documentation. But you also have to know to ask questions.
Comments
If a data element appears to contain data that should be impossible or makes no logical sense, it almost always means you’re missing a key piece of information about the data definition and/or the QA/QC and how the data are used.
The sequence is:
Read the documentation
Ask questions
Look at the actual data
Run queries to filter for the information you need
Then ask questions about things that maybe don’t look right.
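For what it's worth, the last three steps might look something like this in Python, using only the built-in sqlite3 module. This is a rough sketch: the visits table, its columns, the sample rows, and the 0 to 120 age range are all made up for illustration. The real valid ranges and special codes come from the documentation and from the people who know the data.

```python
import sqlite3

# Hypothetical table and values, invented purely for illustration.
rows = [
    (1, "2021-03-01", 34, 120.5),
    (2, "2021-03-02", -1, 98.0),    # impossible age, or a documented "unknown" code?
    (3, "2021-03-02", 999, 101.2),  # looks like a sentinel value; confirm before assuming
    (4, "2021-03-05", 52, None),    # missing reading
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (id INTEGER, visit_date TEXT, age INTEGER, reading REAL)"
)
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", rows)

# Look at the actual data: row count, age range, number of missing readings.
summary = conn.execute(
    "SELECT COUNT(*), MIN(age), MAX(age), SUM(reading IS NULL) FROM visits"
).fetchone()

# Run a query to filter for the information you need.
# The 0..120 range is an assumption for this sketch, not a rule from any real system.
usable = conn.execute(
    "SELECT id FROM visits WHERE age BETWEEN 0 AND 120 ORDER BY id"
).fetchall()

# Collect the rows that don't look right so you can ask about them,
# rather than silently dropping them.
questions = conn.execute(
    "SELECT id, age FROM visits WHERE age NOT BETWEEN 0 AND 120 ORDER BY id"
).fetchall()

print(summary)
print(usable)
print(questions)
```

The point of keeping the odd rows in their own list is that a value like -1 or 999 is more often an undocumented code than a typo, and only someone who knows the system can tell you which.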
Jump right in and create an analysis, and you WILL almost certainly screw it up.