Welcome! Now that you have initialized your project, the best way to work with Great Expectations is in this iterative dev loop:
great_expectations suite new
great_expectations suite edit
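In practice the loop looks something like this (a minimal sketch; the suite name my_suite is a placeholder, and suite edit opens a Jupyter notebook where you refine the suite and re-validate):

# scaffold a new Expectation Suite from a batch of your data
great_expectations suite new
# reopen an existing suite for editing; my_suite is a placeholder name
great_expectations suite edit my_suite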
An expectation is a falsifiable, verifiable statement about data.
Expectations provide a shared language for talking about data characteristics and data quality: humans to humans, humans to machines, and machines to machines.
Expectations are both data tests and docs! For example, here is an Expectation in JSON format:
{
  "expectation_type": "expect_column_values_to_not_be_null",
  "kwargs": {
    "column": "user_id"
  }
}
A machine can test whether a dataset conforms to the expectation and record the outcome as a Validation Result:
{
  "success": false,
  "result": {
    "element_count": 253405,
    "unexpected_count": 7602,
    "unexpected_percent": 2.999
  },
  "expectation_config": {
    "expectation_type": "expect_column_values_to_not_be_null",
    "kwargs": {
      "column": "user_id"
    }
  }
}
The example Validation Result above (not from your data) carries rich context about the test failure: how many elements were checked, how many were unexpected, and which expectation was violated.
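As a minimal sketch of how such a result can be produced interactively, assuming the classic pandas-flavored API and a placeholder file named events.csv:

import great_expectations as ge

# load a CSV into a Great Expectations dataset ("events.csv" is a placeholder)
df = ge.read_csv("events.csv")

# run a single expectation; the return value mirrors the JSON above
result = df.expect_column_values_to_not_be_null("user_id")
print(result.success)

With the default result format, the returned object carries the same fields shown above: a success flag, a result block with element and unexpected counts, and the originating expectation_config.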
This is an example of what a single failed Expectation looks like in Data Docs. Note that the failure report includes sampled unexpected values from your data, which helps you debug pipelines faster.
Nearly 50 built-in expectations let you express how you understand your data, and you can add custom expectations if you need a new one.
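One classic pattern for defining a custom expectation, sketched here under the assumption of the pandas-flavored dataset API (the expectation name and the even-number rule are invented for illustration):

from great_expectations.dataset import MetaPandasDataset, PandasDataset

class MyCustomDataset(PandasDataset):
    _data_asset_type = "MyCustomDataset"

    # the decorator turns a row-wise boolean check into a full expectation,
    # handling result formatting, the "mostly" threshold, and unexpected-value samples
    @MetaPandasDataset.column_map_expectation
    def expect_column_values_to_be_even(self, column):
        return column % 2 == 0

Datasets loaded with dataset_class=MyCustomDataset would then expose expect_column_values_to_be_even alongside the built-in expectations.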
This sample suite shows you a few examples of expectations.
Note that this is not a production suite; it was generated using only a small sample of your data.
When you are ready, press the "How to Edit" button to kick off the iterative dev loop.
Data Docs autogenerated using Great Expectations.