Top-rated AI prompts for Data Analysis. Copy any prompt and get instant results.
Your complete step-by-step AI guide for Data Analysis. Copy, paste, and get results.
This collection of tested AI prompts for Data Analysis covers structuring the analysis, writing analysis queries and code, interpreting results correctly, and more. Each prompt is free and copy-paste ready: copy it, add your specifics, and get professional Data Analysis results in seconds.
Stage 1
Analysis that starts without a clear question produces answers nobody needs. These prompts help you define what you are trying to find out before writing a single query.
Define analysis question and approach
I need to analyze [DESCRIBE DATA/SITUATION] to answer this business question: [DESCRIBE QUESTION]. Help me structure the analysis: what is the precise question I am trying to answer, what data do I need, what type of analysis is appropriate (descriptive, diagnostic, predictive, prescriptive), what would a useful output look like, and what are the limitations I should be upfront about?
Identify metrics for business question
The business question is: [DESCRIBE QUESTION]. What metrics should I analyze to answer it? For each metric, explain: what it measures, how to calculate it from my data, what a good versus bad value looks like, and what additional context is needed to interpret it correctly. Data available: [DESCRIBE DATA SOURCES]
Design analysis for A/B test results
I ran an A/B test: [DESCRIBE TEST]. Here are the results: [PASTE RESULTS OR DESCRIBE]. Help me analyze whether the result is statistically significant, practically significant (not just statistically), and whether there are segments or confounders I should check before concluding the test was successful or failed.
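For a sense of what the statistical significance check in this prompt involves, here is a minimal sketch of a two-proportion z-test. It assumes the test metric is a conversion rate; the function name and all counts are made up for illustration, and a real test may call for a different method.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"lift": p_b - p_a, "z": z, "p_value": p_value}


# Hypothetical counts: control (A) versus variant (B).
result = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(result)
```

A small p-value only covers statistical significance. Whether the absolute lift is large enough to matter for the business is a separate judgment, which is why the prompt asks for both.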
Identify data quality issues
Before I analyze this dataset, help me identify data quality issues. Here is a sample or description: [DESCRIBE DATA]. What should I check for: missing values, outliers, duplicates, inconsistent formats, implausible values? For each issue found, suggest how to handle it in the analysis.
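As a rough illustration of the checks this prompt asks about, the sketch below counts missing values, duplicate keys, and implausible negative amounts. It uses only the Python standard library; the rows, field names, and the `quality_report` helper are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; None marks a missing value.
rows = [
    {"order_id": 1, "amount": 25.0, "country": "US"},
    {"order_id": 2, "amount": None, "country": "us"},
    {"order_id": 2, "amount": None, "country": "us"},
    {"order_id": 3, "amount": -10.0, "country": "DE"},
]


def quality_report(rows, key, numeric_field):
    """Summarize missing values, duplicate keys, and negative numbers."""
    # Count missing values per field.
    missing = Counter(
        field for row in rows for field, value in row.items() if value is None
    )
    # Keys that appear more than once suggest duplicate records.
    key_counts = Counter(row[key] for row in rows)
    duplicates = sorted(k for k, count in key_counts.items() if count > 1)
    # Negative amounts are treated as implausible in this example.
    negatives = [
        row[key]
        for row in rows
        if row[numeric_field] is not None and row[numeric_field] < 0
    ]
    return {
        "missing": dict(missing),
        "duplicate_keys": duplicates,
        "negative_values": negatives,
    }


report = quality_report(rows, key="order_id", numeric_field="amount")
print(report)
```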
Choose the right visualization type
I want to visualize [DESCRIBE DATA AND RELATIONSHIP]. For each of these scenarios: [LIST SCENARIOS], tell me which chart type is most appropriate and why. What are the most common mistakes people make when visualizing this type of data, and what makes a chart misleading versus informative?
Stage 2
Getting the data into the right shape is often most of the work. These prompts help you write the code to do it correctly.
Write data analysis in Python/pandas
Write Python code using pandas to analyze this dataset: [DESCRIBE DATA STRUCTURE]. I want to: [DESCRIBE ANALYSIS GOALS]. Include: loading the data, cleaning obvious issues, computing the metrics I need, and producing a summary output. Add comments explaining the analytical choices.
Write SQL for business metrics
Write SQL queries to calculate these business metrics from this database schema: [DESCRIBE SCHEMA]. Metrics needed: [LIST METRICS, e.g. DAU, retention rate, revenue per user, churn]. For each metric, write the query and explain any non-obvious calculation choices, especially around time windows and user definitions.
Calculate cohort analysis
I want to do a cohort analysis on [DESCRIBE USER BEHAVIOR, e.g. retention, revenue, feature adoption]. My data has: [DESCRIBE TABLE STRUCTURE]. Write the SQL or Python code to: define cohorts by [DIMENSION, e.g. signup week, first purchase month], calculate [METRIC] for each cohort over time, and output a cohort grid I can visualize.
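To show the shape of the output this prompt is asking for, here is a minimal sketch of a retention cohort grid. It assumes cohorts are defined by signup month and that months are already encoded as integers; the data and the `retention_grid` helper are hypothetical, and the code uses plain Python rather than pandas or SQL so it runs without extra installs.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, signup_month, active_month).
activity = [
    ("u1", 0, 0), ("u1", 0, 1), ("u1", 0, 2),
    ("u2", 0, 0), ("u2", 0, 1),
    ("u3", 1, 1), ("u3", 1, 2),
    ("u4", 1, 1),
]


def retention_grid(rows):
    """Return {cohort: [share of cohort active at month offset 0, 1, ...]}."""
    cohort_users = defaultdict(set)   # cohort -> users who signed up then
    active_users = defaultdict(set)   # (cohort, months since signup) -> users
    for user, signup, active in rows:
        cohort_users[signup].add(user)
        active_users[(signup, active - signup)].add(user)

    grid = {}
    for cohort, users in sorted(cohort_users.items()):
        max_offset = max(off for (c, off) in active_users if c == cohort)
        grid[cohort] = [
            len(active_users[(cohort, off)]) / len(users)
            for off in range(max_offset + 1)
        ]
    return grid


grid = retention_grid(activity)
for cohort, retention in grid.items():
    print(cohort, retention)
```

Each row of the grid is one cohort and each column is the number of months since signup, which is the layout a cohort heatmap expects.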
Detect and handle outliers
My dataset contains outliers that may be affecting my analysis: [DESCRIBE DATA AND OUTLIER CONCERN]. Write code to: detect outliers using [METHOD, e.g. IQR, z-score, or suggest appropriate method], visualize their distribution, and give me the options for handling them with the pros and cons of each approach for my specific use case.
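As one example of the detection step, the sketch below applies the IQR rule mentioned in the prompt. The order values are made up, and the multiplier of 1.5 is the conventional default rather than a requirement; a different threshold may suit your data better.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    flagged = [v for v in values if v < lower or v > upper]
    return flagged, (lower, upper)


# Hypothetical order values with one suspicious entry.
order_values = [12, 15, 14, 13, 16, 15, 14, 13, 15, 250]
outliers, (lower, upper) = iqr_outliers(order_values)
print(outliers, lower, upper)
```

Flagging a value is not the same as deciding to drop it. Whether to remove, cap, or keep an outlier depends on whether it is an error or a real but rare event.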
Aggregate and reshape data
I need to transform this data from [DESCRIBE CURRENT FORMAT] to [DESCRIBE DESIRED FORMAT] for my analysis. Write the pandas or SQL code to do this transformation. Common transformations: pivot/unpivot, groupby aggregation, rolling windows, merging tables on [KEYS].
Stage 3
Drawing the wrong conclusion from correct analysis is a common and costly mistake. These prompts help you interpret findings accurately.
Interpret correlation vs causation
My analysis shows a correlation between [VARIABLE A] and [VARIABLE B]: [DESCRIBE FINDING]. Help me think through whether this could be causal or whether there are plausible confounders or alternative explanations. What additional analysis would help distinguish correlation from causation here?
Check analysis for common biases
I am about to present this analysis: [DESCRIBE ANALYSIS AND FINDINGS]. Review it for common analytical biases: survivorship bias, selection bias, Simpson's paradox, p-hacking, and confirmation bias. Where might any of these be affecting my conclusions?
Assess statistical significance
My analysis shows this result: [DESCRIBE FINDING WITH NUMBERS]. Is this result statistically significant? Given my sample size of [N], what is the confidence interval? What would I need for this result to be conclusive rather than suggestive? Explain in terms a non-statistician can act on.
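To make the link between sample size and confidence interval concrete, here is a minimal sketch that assumes the finding is a proportion (for example a conversion rate) and uses the normal approximation. The helper name and the counts are hypothetical, and the approximation is weak for very small samples or rates near 0 or 1.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p = successes / n
    # Critical value for a two-sided interval at the given confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Same observed rate (12%), two hypothetical sample sizes.
low_small, high_small = proportion_ci(12, 100)
low_large, high_large = proportion_ci(1200, 10000)
print(low_small, high_small)
print(low_large, high_large)
```

The observed rate is identical in both cases, but the interval around it is far narrower with the larger sample, which is the difference between a suggestive and a conclusive result.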
Identify what the data cannot tell you
Based on this analysis: [DESCRIBE ANALYSIS], what questions am I unable to answer with this data alone? What additional data would I need to be confident in the conclusion? What assumptions am I making that I should be explicit about when presenting this?
Sanity check analysis results
My analysis produced this result: [DESCRIBE RESULT]. This seems [SURPRISING/TOO CLEAN/UNEXPECTED]. Help me sanity check it. What are the most common causes of results like this being wrong: data pipeline issues, query errors, definition mismatches, or sampling problems? Walk me through how to verify the result is accurate before presenting it.
Stage 4
An analysis nobody acts on wasted everyone's time. These prompts help you translate findings into clear, decision-ready communication.
Write data analysis summary
Write a clear executive summary of this analysis: [DESCRIBE ANALYSIS AND FINDINGS]. The audience is [DESCRIBE AUDIENCE, e.g. non-technical executives, product team]. The summary should: lead with the answer, not the methodology, explain the key finding in one sentence, provide the supporting evidence, and end with a clear recommendation or next step.
Build data storytelling narrative
I want to present this data as a story rather than a collection of charts: [DESCRIBE DATA AND FINDINGS]. Help me structure the narrative: what is the central tension or question, what does the data reveal, what is surprising, and what should the audience do differently as a result? Write the narrative arc I can follow in my presentation.
Write analysis for technical and non-technical audiences
I need to present these findings to two different audiences: [DESCRIBE FINDINGS]. Write a version for technical stakeholders (including methodology and caveats) and a version for executives (focusing on implications and decisions). Each should be no longer than one page.
Create analysis dashboard brief
I am building a dashboard to track [DESCRIBE METRICS AND BUSINESS QUESTION]. Write a brief that defines: which metrics to show, how to define each one precisely, what time dimensions to use, what target or benchmark to show alongside each metric, and what drill-down views are most valuable. The audience is [DESCRIBE].
Write data-driven recommendation
Based on this analysis: [DESCRIBE FINDINGS], write a clear recommendation memo. Format: one-sentence conclusion, the three strongest data points that support it, the most important caveat or risk, and a specific recommended action with an owner and timeline. This will be shared with [DESCRIBE AUDIENCE].
Descriptive analytics tells you what happened (revenue was down 15% last quarter). Diagnostic analytics tells you why it happened (revenue was down because churn increased in the SMB segment following a price change). Descriptive comes first; diagnostic requires drilling into the data to find the cause.
The sample size you need depends on the effect size you are trying to detect, the variability of the metric, and the confidence level and power you require. Use a power analysis to calculate the required sample size before you run the analysis. As a rough rule: to detect a 10% change with 95% confidence and 80% power, you typically need at least several hundred observations, and often thousands or more when the effect is smaller or the baseline rate is low.
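A power analysis of this kind can be sketched as follows. This assumes a comparison of two conversion rates and uses the standard normal-approximation formula for sample size per group; the function name and the baseline rates are hypothetical, and other metrics need a different formula.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate observations needed per group to detect a relative lift."""
    p1 = p_baseline
    p2 = p_baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Same 10% relative lift, two hypothetical baseline conversion rates.
n_high_baseline = sample_size_per_group(0.50, 0.10)
n_low_baseline = sample_size_per_group(0.05, 0.10)
print(n_high_baseline, n_low_baseline)
```

Running it with different baselines shows how strongly the answer depends on the starting rate: the same relative lift needs far more data when the baseline is low.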
Simpson's paradox is a phenomenon where a trend appears in several groups of data but disappears or reverses when the groups are combined. A classic example: a treatment appears better in each subgroup but worse overall because the subgroups differ greatly in size and in baseline outcome. Always check whether aggregate results hold up when segmented.
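The reversal is easier to believe with numbers in front of you. The sketch below uses illustrative counts of successes and patients for two treatments across two segments; the variable names are hypothetical.

```python
# Illustrative counts: treatment -> segment -> (successes, patients).
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

# Success rate within each segment.
segment_rates = {
    treatment: {seg: s / n for seg, (s, n) in segments.items()}
    for treatment, segments in data.items()
}

# Success rate after pooling the segments together.
overall_rates = {
    treatment: sum(s for s, _ in segments.values())
    / sum(n for _, n in segments.values())
    for treatment, segments in data.items()
}

for seg in ("small", "large"):
    print(seg, segment_rates["A"][seg], segment_rates["B"][seg])
print("overall", overall_rates["A"], overall_rates["B"])
```

Treatment A has the higher rate in both segments, yet B looks better overall, because A was used mostly on the harder "large" segment while B was used mostly on the easier "small" one.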
Use the median when your data has outliers or a skewed distribution. Revenue, response times, and user behavior data are almost always skewed, so the median is usually more representative of a typical value. The mean is appropriate when the data is roughly symmetric, or when the total matters (for example, average revenue per user multiplied by the number of users gives total revenue).
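A quick way to see the difference is to compute both on a small skewed sample. The revenue figures below are made up, with one very large customer standing in for the long tail.

```python
from statistics import mean, median

# Hypothetical revenue per user; one large account skews the distribution.
revenue = [20, 22, 25, 24, 21, 23, 26, 22, 24, 5000]

typical = median(revenue)   # robust to the single large value
average = mean(revenue)     # pulled upward by the single large value

print("median:", typical)
print("mean:", average)
```

The median stays close to what most users spend, while the mean is dominated by the one large account. Reporting both is often the most honest option.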
SQL is non-negotiable for anyone working with data. Python with pandas and matplotlib covers most analysis tasks. For visualization, use Tableau or Power BI if you need business dashboards, and matplotlib or seaborn for technical analysis. For more rigorous statistical tests, learn R or Python's scipy and statsmodels.