
Data Literacy

About CNA and its Data Science Division

About CNA

CNA is a nonprofit research and analysis organization dedicated to the safety and security of the nation. It operates the Center for Naval Analyses, the only federally funded research and development center (FFRDC) serving the Department of the Navy, as well as the Institute for Public Research.

CNA develops actionable solutions to complex problems of national importance that support naval operations, fleet readiness, and great power competition. Its non-defense research portfolio spans criminal justice, homeland security, and data management.

CNA’s Data Science Division

CNA’s Data Science Division provides data science expertise and support to the Department of the Navy. The program develops advanced analytics and algorithms, including:

  • Machine learning
  • Predictive analytics

These tools identify critical drivers of performance to improve Navy and Marine Corps outcomes. The program also facilitates decision-making through:

  • Design-thinking workshops
  • Solution-focused methodologies
  • Quick-turn efforts

Data Literacy Course Overview

This course, designed by data scientists at the Center for Naval Analyses, is quick to complete and easily digestible. It highlights common challenges in data-centric organizations, using examples drawn from the Performance-to-Plan efforts in the Naval Aviation Enterprise.

It blends real-world scenarios with notional examples to help you become a data-literate leader in a data-driven Navy.

Data Analysis as an OODA Loop

Data-driven decision-making mirrors Boyd’s OODA Loop (Observe-Orient-Decide-Act). The process includes four essential steps:

  1. Collect all necessary data
  2. Process data to ensure usability
  3. Analyze data rigorously
  4. Act decisively based on analysis

The goal? Reduce the time required to make informed, objective decisions.

 




📥 Download the full course here! (*CAC-enabled, requires permission)

Knowing Data

Data Quantity

You can be confident in the results of an analysis only if there are sufficient data. For example, could you determine whether a baseball player is a good hitter after watching only a single at-bat? What if you knew whether the player swung at each pitch but had no information on the result?

  • Are there enough observations?
  • Were any data removed? If so, why?
  • Do the data contain the right fields?
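
The "enough observations" question can be made concrete with a confidence interval. The sketch below is one illustrative way to do that, not the course's own method: it computes an approximate 95% Wilson score interval for a batting average. The function name `wilson_interval` and the hit counts are notional and chosen only for illustration.

```python
import math


def wilson_interval(hits: int, at_bats: int, z: float = 1.96) -> tuple:
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    if at_bats <= 0:
        raise ValueError("need at least one at-bat")
    p = hits / at_bats
    denom = 1 + z**2 / at_bats
    center = (p + z**2 / (2 * at_bats)) / denom
    half_width = z * math.sqrt(p * (1 - p) / at_bats + z**2 / (4 * at_bats**2)) / denom
    return center - half_width, center + half_width


# Notional records: one at-bat, ten at-bats, and a full season.
for hits, at_bats in [(1, 1), (3, 10), (150, 500)]:
    low, high = wilson_interval(hits, at_bats)
    print(f"{hits}/{at_bats}: plausible average between {low:.3f} and {high:.3f}")
```

The interval narrows as the number of at-bats grows, which is the quantitative version of the point above: a single at-bat leaves almost every batting average plausible.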

Data Quality

There is an old mantra in data analysis: garbage in, garbage out. Even the best analytic tools cannot produce good results from bad data. Having a significant quantity of data is not very meaningful if the data are not accurate, complete, and consistent.

Could you determine whether a baseball player is a good hitter if the data on half of their at-bats were corrupted? What if the capital letters “O” and “I” were transcribed as the numerals “0” and “1”, respectively? A computer views “OUT” and “0UT” as being distinct strings!

  • What are the possible sources of error?
  • What is the error rate?
  • Are the data authoritative, i.e. are they used as the basis for other analyses and decision-making?
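
The transcription problem described above is easy to reproduce. The following sketch, using a notional list of at-bat outcomes and a hypothetical set of valid outcome labels (neither comes from the course), shows that the two strings compare as different, estimates an error rate, and proposes repairs only where swapping the digits back yields a known label.

```python
from collections import Counter

# Hypothetical vocabulary of valid outcomes (an assumption for this example).
VALID_OUTCOMES = {"OUT", "SINGLE", "DOUBLE", "TRIPLE", "HOME RUN", "WALK"}

# Notional records containing the O/0 and I/1 transcription errors.
records = ["OUT", "0UT", "SINGLE", "S1NGLE", "OUT", "WALK"]

# To a computer these are simply different strings.
print("OUT" == "0UT")

counts = Counter(records)
invalid = sorted(value for value in counts if value not in VALID_OUTCOMES)
error_rate = sum(counts[value] for value in invalid) / len(records)

# Suggest a repair only when mapping 0 -> O and 1 -> I gives a valid label.
DIGIT_TO_LETTER = str.maketrans({"0": "O", "1": "I"})
repairs = {
    value: value.translate(DIGIT_TO_LETTER)
    for value in invalid
    if value.translate(DIGIT_TO_LETTER) in VALID_OUTCOMES
}

print(f"invalid values: {invalid}")
print(f"error rate: {error_rate:.0%}")
print(f"suggested repairs: {repairs}")
```

Checking values against an agreed vocabulary addresses the first two questions in the list: it locates a source of error and measures how often it occurs.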

Data Sanity

  • Are the data appropriate for a particular analysis?
  • Are there authoritative rules to identify “good” and “bad” data?
  • Are the hardware and software systems that acquire, process, and store the data well understood?

Data Visualization

Visualize Your Data

Graphical plots and figures can be extremely powerful tools for understanding data and representing analysis results. The famous Anscombe’s Quartet, shown below, is a classic example of the value of data visualization. These four data sets have identical summary statistics—the mean values, variances, and best-fit regression lines are all the same—but plotting the underlying data shows them to have very different distributions.
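
The claim about matching summary statistics can be checked numerically. This sketch uses the published Anscombe values and only the Python standard library. The helper name `summarize` is an illustrative choice, and the least-squares slope and intercept are computed by hand rather than with a plotting or statistics package.

```python
from statistics import mean, variance

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, and the least-squares line y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(xs),
        "var_y": variance(ys),
        "slope": slope,
        "intercept": my - slope * mx,
    }


for name, (xs, ys) in ANSCOMBE.items():
    s = summarize(xs, ys)
    print(f"{name:>3}: mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f} "
          f"fit: y = {s['intercept']:.2f} + {s['slope']:.2f}x")
```

The printed rows agree to about two decimal places, so the numbers alone give no hint that the four data sets differ. Only a plot of the points reveals that.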




However, data visualization tools must be used with caution. Inappropriate use of colors, scales, labels, and data selection can easily lead to misinterpretation. The figure below has expenses and revenues plotted on two separate vertical scales in a sneaky attempt to show that expenses (red line) are increasing at a faster rate than revenues (blue line). 




Representing the two values by their relative changes over the indicated time period shows that revenue has increased nearly 2.5 times as quickly as expenses.
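
Putting both series on a common relative scale is a simple calculation. The sketch below uses notional expense and revenue figures (they are not the values behind the figure discussed above) and a hypothetical helper `percent_change` to compare growth without relying on two vertical axes.

```python
def percent_change(series):
    """Percent change from the first to the last value of a series."""
    first, last = series[0], series[-1]
    if first == 0:
        raise ValueError("baseline value must be non-zero")
    return (last - first) / first * 100


def index_to_baseline(series):
    """Rescale a series so that its first value equals 100."""
    return [value / series[0] * 100 for value in series]


# Notional yearly figures, in arbitrary currency units.
expenses = [100, 104, 108, 112]
revenue = [400, 430, 465, 520]

print(f"expenses grew {percent_change(expenses):.1f}%")
print(f"revenue grew {percent_change(revenue):.1f}%")
print("expenses index:", index_to_baseline(expenses))
print("revenue index: ", index_to_baseline(revenue))
```

Plotting the two indexed series on one shared axis lets a reader compare growth rates directly, which removes the distortion that separately scaled axes can introduce.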

Using Data

Upholding Data Analysis Integrity

If you torture the data enough, it will confess.

A proper analysis must be objective, rigorous, transparent, and reproducible. Analysis plans should be written in advance, because analysts are prone to adjusting their analyses to fit a specific conclusion, sometimes inadvertently. These plans should describe every step taken at the collection, processing, and analysis stages. For prescriptive analyses, where analytic results direct specific actions, the analyst should also have a plan for quantitatively measuring the impact of those actions.



 

Key Components of a Comprehensive Analysis Plan


Analysis plans should include details about:

  • The data sources used in the analysis
  • Processing steps applied to the raw data, for example:
    • Selecting – which observations and features are included in the analysis?
    • Formatting – are the data modified in any way?
    • Filtering – are specific observations excluded due to values in their features?
  • The category of analysis (descriptive, predictive, prescriptive)
  • The type of analysis (regression, categorization, clustering, optimization, etc.)
  • How the performance of the analysis will be assessed
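
One way to keep such a plan transparent is to record it as a small data structure that logs every processing step as it runs. The sketch below is an assumption about how that might look, not a CNA tool. The class `AnalysisPlan`, its field names, the source name, and the flight-hour records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisPlan:
    """Hypothetical record of an analysis plan and the processing actually done."""
    data_sources: list
    category: str            # descriptive, predictive, or prescriptive
    analysis_type: str       # e.g. regression, clustering
    performance_metric: str  # how the analysis will be assessed
    processing_log: list = field(default_factory=list)

    def filter_rows(self, rows, keep, reason):
        """Apply a filter and record how many observations it removed, and why."""
        kept = [row for row in rows if keep(row)]
        removed = len(rows) - len(kept)
        self.processing_log.append(
            f"filter: removed {removed} of {len(rows)} rows ({reason})"
        )
        return kept


# Notional data: one record is missing its flight hours.
rows = [
    {"aircraft": "A", "flight_hours": 12.5},
    {"aircraft": "B", "flight_hours": None},
    {"aircraft": "C", "flight_hours": 8.0},
]

plan = AnalysisPlan(
    data_sources=["notional maintenance log"],
    category="descriptive",
    analysis_type="regression",
    performance_metric="held-out mean absolute error",
)
usable = plan.filter_rows(
    rows, lambda row: row["flight_hours"] is not None, "missing flight hours"
)

print(len(usable), "rows kept")
print(plan.processing_log)
```

Because each filter writes its own log entry, the question "Were any data removed? If so, why?" can be answered from the plan itself rather than from memory.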

Start Becoming More Data Literate

Enhancing Data Literacy: A Call to Action

  • Lead by Example: Consistently question the data and analyses presented to you. When in doubt, ask; when certain, ask anyway.

  • Foster a Culture of Inquiry: Encourage everyone to challenge the data they use and the analyses they create. Regularly conduct red-team exercises to identify errors and inconsistencies.

  • Promote Data-Driven Decision-Making: Discourage decisions based solely on instinct or tradition. The mindset of "because we've always done it this way" can hinder progress.

  • Embrace Outcome Analysis: Strive to understand the results of actions taken. Reward acknowledgment and understanding of failures, and avoid complacency in success by seeking to understand the underlying reasons.

By adopting these practices, you can cultivate a data-literate environment that values critical thinking and continuous improvement.
