AI Prompts for Data Analysis

Top-rated AI prompts for Data Analysis. Copy any prompt and get instant results.

Your complete step-by-step AI guide for Data Analysis. Copy, paste, and get results.

This collection of tested AI prompts for Data Analysis covers four stages: structuring the analysis, writing analysis queries and code, interpreting results correctly, and communicating findings. Each prompt is copy-paste ready and free to use. Copy any prompt, add your specifics, and get professional Data Analysis results in seconds.

Stage 1

Structure the Analysis

Analysis that starts without a clear question produces answers nobody needs. These prompts help you define what you are trying to find out before writing a single query.

Define analysis question and approach

I need to analyze [DESCRIBE DATA/SITUATION] to answer this business question: [DESCRIBE QUESTION]. Help me structure the analysis: what is the precise question I am trying to answer, what data do I need, what type of analysis is appropriate (descriptive, diagnostic, predictive, prescriptive), what would a useful output look like, and what are the limitations I should be upfront about?

Identify metrics for business question

The business question is: [DESCRIBE QUESTION]. What metrics should I analyze to answer it? For each metric, explain: what it measures, how to calculate it from my data, what a good versus bad value looks like, and what additional context is needed to interpret it correctly. Data available: [DESCRIBE DATA SOURCES]

Design analysis for A/B test results

I ran an A/B test: [DESCRIBE TEST]. Here are the results: [PASTE RESULTS OR DESCRIBE]. Help me analyze whether the result is statistically significant, whether it is practically significant (not just statistically), and whether there are segments or confounders I should check before concluding the test succeeded or failed.
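To sanity check whatever the prompt returns, it can help to run the significance test yourself. The following is a minimal sketch of a pooled two-proportion z-test, assuming the test metric is a conversion rate; the function name and the counts are hypothetical and only illustrate the calculation.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value


# Hypothetical counts: 200 of 2,000 users converted in A, 250 of 2,000 in B.
lift, z, p_value = two_proportion_ztest(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"absolute lift={lift:.3f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value only addresses statistical significance. Whether an absolute lift of this size is worth shipping is the practical-significance question the prompt asks about separately.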

Identify data quality issues

Before I analyze this dataset, help me identify data quality issues. Here is a sample or description: [DESCRIBE DATA]. What should I check for: missing values, outliers, duplicates, inconsistent formats, implausible values? For each issue found, suggest how to handle it in the analysis.
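The checks named in this prompt can be sketched in a few lines of pandas. The example below assumes a small, hypothetical orders table (the column names and values are invented for illustration) with one problem of each kind planted in it.

```python
import pandas as pd

# Hypothetical sample with deliberately planted problems.
df = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "amount": [19.99, 25.00, 25.00, None, -5.00],
    "country": ["US", "us", "us", "DE", "DE"],
})

report = {
    # Missing values per column.
    "missing_per_column": df.isna().sum().to_dict(),
    # Fully identical rows beyond the first occurrence.
    "duplicate_rows": int(df.duplicated().sum()),
    # Implausible values: an order amount should not be negative.
    "negative_amounts": int((df["amount"] < 0).sum()),
    # Inconsistent formats: "US" and "us" show up as separate values.
    "country_variants": sorted(df["country"].unique()),
}
print(report)
```

Which checks count as "implausible" depends on the dataset, so the negative-amount rule here is an assumption to replace with your own domain rules.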

Choose the right visualization type

I want to visualize [DESCRIBE DATA AND RELATIONSHIP]. For each of these scenarios: [LIST SCENARIOS], tell me which chart type is most appropriate and why. What are the most common mistakes people make when visualizing this type of data, and what makes a chart misleading versus informative?

Stage 2

Write Analysis Queries and Code

Getting the data into the right shape is often most of the work. These prompts help you write the code to do it correctly.

Write data analysis in Python/pandas

Write Python code using pandas to analyze this dataset: [DESCRIBE DATA STRUCTURE]. I want to: [DESCRIBE ANALYSIS GOALS]. Include: loading the data, cleaning obvious issues, computing the metrics I need, and producing a summary output. Add comments explaining the analytical choices.

Write SQL for business metrics

Write SQL queries to calculate these business metrics from this database schema: [DESCRIBE SCHEMA]. Metrics needed: [LIST METRICS, e.g. DAU, retention rate, revenue per user, churn]. For each metric, write the query and explain any non-obvious calculation choices, especially around time windows and user definitions.
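As an illustration of the "user definitions" point, here is a minimal sketch of a DAU query run against an in-memory SQLite database from Python. The events table and its columns are hypothetical, and the sketch assumes "active" means at least one event on that calendar day in the timezone the timestamps are stored in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event_time TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2024-01-01 09:00:00"),
        (1, "2024-01-01 17:30:00"),  # same user twice in one day counts once
        (2, "2024-01-01 12:00:00"),
        (1, "2024-01-02 08:15:00"),
    ],
)

# COUNT(DISTINCT ...) is the non-obvious choice: counting rows would
# measure events per day, not active users per day.
dau = conn.execute(
    """
    SELECT date(event_time) AS day, COUNT(DISTINCT user_id) AS dau
    FROM events
    GROUP BY date(event_time)
    ORDER BY day
    """
).fetchall()
print(dau)
```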

Calculate cohort analysis

I want to do a cohort analysis on [DESCRIBE USER BEHAVIOR, e.g. retention, revenue, feature adoption]. My data has: [DESCRIBE TABLE STRUCTURE]. Write the SQL or Python code to: define cohorts by [DIMENSION, e.g. signup week, first purchase month], calculate [METRIC] for each cohort over time, and output a cohort grid I can visualize.
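For reference, a cohort grid of the kind this prompt asks for might look like the following pandas sketch. It assumes a hypothetical activity log with one row per user per active day, defines cohorts by the month of each user's first event, and measures retention as the share of the cohort active in each later month.

```python
import pandas as pd

# Hypothetical activity log: one row per user per active day.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "event_date": pd.to_datetime([
        "2024-01-05", "2024-02-10", "2024-03-02",
        "2024-01-20", "2024-03-15",
        "2024-02-03",
    ]),
})

# Cohort = calendar month of each user's first event.
first_seen = events.groupby("user_id")["event_date"].transform("min")
events["cohort"] = first_seen.dt.strftime("%Y-%m")

# Months elapsed since the cohort month (0 = the first month itself).
events["period"] = (
    (events["event_date"].dt.year - first_seen.dt.year) * 12
    + (events["event_date"].dt.month - first_seen.dt.month)
)

# Rows = cohorts, columns = periods, values = distinct active users.
active = (
    events.groupby(["cohort", "period"])["user_id"].nunique().unstack(fill_value=0)
)

# Divide each row by its period-0 size to get retention rates.
retention = active.div(active[0], axis=0)
print(retention)
```

Note that filling empty cells with 0 makes periods a young cohort has not reached yet look like zero retention, so mask those cells before plotting.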

Detect and handle outliers

My dataset contains outliers that may be affecting my analysis: [DESCRIBE DATA AND OUTLIER CONCERN]. Write code to: detect outliers using [METHOD, e.g. IQR, z-score, or suggest appropriate method], visualize their distribution, and give me the options for handling them with the pros and cons of each approach for my specific use case.
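The IQR method mentioned in this prompt can be sketched with the standard library alone. The helper name, the 1.5 multiplier default, and the response times below are assumptions for illustration; the multiplier is a common convention, not a rule.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] and the fences used."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    flagged = [v for v in values if v < low or v > high]
    return flagged, (low, high)


# Hypothetical response times in milliseconds with one extreme value.
response_ms = [120, 135, 128, 142, 131, 125, 138, 2400]
outliers, fences = iqr_outliers(response_ms)
print(outliers, fences)
```

Flagging a value is not the same as deciding to drop it. Whether 2400 ms is an error or a real slow request is the judgment call the prompt asks the model to help with.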

Aggregate and reshape data

I need to transform this data from [DESCRIBE CURRENT FORMAT] to [DESCRIBE DESIRED FORMAT] for my analysis. Write the pandas or SQL code to do this transformation. The transformation may involve: pivot/unpivot, groupby aggregation, rolling windows, or merging tables on [KEYS].
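As a small example of the pivot/unpivot case, the sketch below reshapes a hypothetical wide revenue table (one column per month) into long format and back again. The table and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per region, one column per month.
wide = pd.DataFrame({
    "region": ["North", "South"],
    "2024-01": [100, 80],
    "2024-02": [110, 95],
})

# Unpivot: one row per (region, month) pair.
long_df = wide.melt(id_vars="region", var_name="month", value_name="revenue")

# Pivot back to check that the round trip loses nothing.
round_trip = long_df.pivot(index="region", columns="month", values="revenue")
print(long_df)
print(round_trip)
```

Long format is usually easier to group, filter, and plot; wide format is usually easier to read in a report.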

Stage 3

Interpret Results Correctly

Drawing the wrong conclusion from correct analysis is a common and costly mistake. These prompts help you interpret findings accurately.

Interpret correlation vs causation

My analysis shows a correlation between [VARIABLE A] and [VARIABLE B]: [DESCRIBE FINDING]. Help me think through whether this could be causal or whether there are plausible confounders or alternative explanations. What additional analysis would help distinguish correlation from causation here?

Check analysis for common biases

I am about to present this analysis: [DESCRIBE ANALYSIS AND FINDINGS]. Review it for common analytical biases: survivorship bias, selection bias, Simpson's paradox, p-hacking, and confirmation bias. Where might any of these be affecting my conclusions?

Assess statistical significance

My analysis shows this result: [DESCRIBE FINDING WITH NUMBERS]. Is this result statistically significant? Given my sample size of [N], what is the confidence interval? What would I need for this result to be conclusive rather than suggestive? Explain in terms a non-statistician can act on.
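To see how sample size drives the confidence interval, here is a minimal sketch assuming the result is a proportion (for example, a conversion rate). It uses the Wilson score interval, which is one common choice among several, and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    )
    return center - half_width, center + half_width


# Same observed rate (12%), very different sample sizes.
small = wilson_interval(12, 100)
large = wilson_interval(1200, 10000)
print(f"n=100:    {small[0]:.3f} to {small[1]:.3f}")
print(f"n=10,000: {large[0]:.3f} to {large[1]:.3f}")
```

The observed rate is identical in both cases, but the interval at n=100 is about ten times wider, which is the difference between a suggestive and a conclusive result.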

Identify what the data cannot tell you

Based on this analysis: [DESCRIBE ANALYSIS], what questions am I unable to answer with this data alone? What additional data would I need to be confident in the conclusion? What assumptions am I making that I should be explicit about when presenting this?

Sanity check analysis results

My analysis produced this result: [DESCRIBE RESULT]. This seems [SURPRISING/TOO CLEAN/UNEXPECTED]. Help me sanity check it. What are the most common causes of results like this being wrong: data pipeline issues, query errors, definition mismatches, or sampling problems? Walk me through how to verify the result is accurate before presenting it.

Stage 4

Communicate Findings

An analysis nobody acts on wasted everyone's time. These prompts help you translate findings into clear, decision-ready communication.

Write data analysis summary

Write a clear executive summary of this analysis: [DESCRIBE ANALYSIS AND FINDINGS]. The audience is [DESCRIBE AUDIENCE, e.g. non-technical executives, product team]. The summary should: lead with the answer, not the methodology, explain the key finding in one sentence, provide the supporting evidence, and end with a clear recommendation or next step.

Build data storytelling narrative

I want to present this data as a story rather than a collection of charts: [DESCRIBE DATA AND FINDINGS]. Help me structure the narrative: what is the central tension or question, what does the data reveal, what is surprising, and what should the audience do differently as a result? Write the narrative arc I can follow in my presentation.

Write analysis for technical and non-technical audiences

I need to present these findings to two different audiences. Findings: [DESCRIBE FINDINGS]. Write a version for technical stakeholders (including methodology and caveats) and a version for executives (focusing on implications and decisions). Each should be no longer than one page.

Create analysis dashboard brief

I am building a dashboard to track [DESCRIBE METRICS AND BUSINESS QUESTION]. Write a brief that defines: which metrics to show, how to define each one precisely, what time dimensions to use, what target or benchmark to show alongside each metric, and what drill-down views are most valuable. The audience is [DESCRIBE AUDIENCE].

Write data-driven recommendation

Based on this analysis: [DESCRIBE FINDINGS], write a clear recommendation memo. Format: one-sentence conclusion, the three strongest data points that support it, the most important caveat or risk, and a specific recommended action with an owner and timeline. This will be shared with [DESCRIBE AUDIENCE].

Frequently asked questions

What is the difference between descriptive and diagnostic analytics?

Descriptive analytics tells you what happened (revenue was down 15% last quarter). Diagnostic analytics tells you why it happened (revenue was down because churn increased in the SMB segment following a price change). Descriptive comes first; diagnostic requires drilling into the data to find the cause.

How do I know if my sample size is large enough?

It depends on the effect size you are trying to detect, the baseline rate or variance of the metric, and the significance level and power you need. Use a power analysis to calculate the required sample size before you run the analysis. As a rough illustration for a two-group test at a 5% significance level and 80% power: detecting a lift in conversion rate from 10% to 15% takes roughly 700 users per group, while detecting a lift from 10% to 11% (a 10% relative change) takes roughly 15,000 per group.
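A power analysis for the simplest case can be sketched as follows. This assumes a two-group comparison of conversion rates using the standard normal-approximation formula; the function name and the baseline and target rates are hypothetical, and other metrics or test designs need a different formula.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_base, p_target, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect p_base -> p_target."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    n = (z_alpha + z_beta) ** 2 * variance / (p_target - p_base) ** 2
    return math.ceil(n)


n_large_lift = sample_size_per_group(0.10, 0.15)  # 10% -> 15% conversion
n_small_lift = sample_size_per_group(0.10, 0.11)  # 10% -> 11% conversion
print(n_large_lift, n_small_lift)
```

Shrinking the detectable lift from five points to one point increases the required sample by a factor of more than twenty, because the sample size scales with the inverse square of the difference.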

What is Simpson's paradox?

A phenomenon where a trend that appears in each of several groups of data disappears or reverses when the groups are combined. A classic example: a treatment has a higher success rate in every subgroup but a lower success rate overall, because it was given mostly to the harder cases, so the subgroups carry very different weights in each treatment's total. Always check whether aggregate results hold up when segmented.
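The reversal is easier to believe once you see the arithmetic. The sketch below uses illustrative counts (successes, patients) for two treatments across two subgroups; the names and numbers are only there to demonstrate the effect.

```python
# Illustrative counts: (successes, patients) per treatment and subgroup.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, total):
    return successes / total


# Success rate within each subgroup.
by_group = {
    treatment: {group: rate(*counts) for group, counts in groups.items()}
    for treatment, groups in results.items()
}

# Success rate after pooling the subgroups.
overall = {
    treatment: rate(
        sum(s for s, _ in groups.values()),
        sum(n for _, n in groups.values()),
    )
    for treatment, groups in results.items()
}

print(by_group)
print(overall)
```

Treatment A wins in both subgroups but loses overall, because most of A's patients are in the harder "large" subgroup while most of B's are in the easier "small" one.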

When should I use median versus mean?

Use median when your data has outliers or a skewed distribution. Revenue, response times, and user behavior data are almost always skewed, so median is usually more representative of a typical value. Mean is appropriate when the data is roughly symmetric, or when you need a figure that scales back to a total (like average revenue per user, which multiplied by the number of users gives total revenue).
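A quick way to see the difference is to compute both on skewed data. The spend figures below are hypothetical: nine typical users and one heavy spender.

```python
from statistics import mean, median

# Hypothetical monthly spend per user: nine typical users, one heavy spender.
spend = [10, 12, 9, 11, 10, 13, 8, 12, 10, 905]

typical = median(spend)        # 10.5: what a typical user spends
average = mean(spend)          # 100: pulled up by the single heavy spender
total = average * len(spend)   # the mean scales back to total revenue

print(typical, average, total)
```

Neither number is wrong. The median describes the typical user, while the mean is the one that reconciles with total revenue.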

What tools should I learn for data analysis?

SQL is non-negotiable for anyone working with data. Python with pandas and matplotlib covers most analysis tasks. For visualization, use Tableau or Power BI if you need business dashboards, and matplotlib or seaborn for technical analysis. For more rigorous statistical tests, learn R or Python's scipy and statsmodels.