
datacompy 0.19.2 documentation - GitHub Pages
DataComPy is a package to compare two DataFrames (or tables) such as Pandas, Spark, Polars, and even Snowflake.
datacompy package - datacompy 0.19.2 documentation
datacompy.base.get_column_tolerance(column: str, tol_dict: Dict[str, float]) → float: Return the tolerance value for a given column from a dictionary of tolerances.
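A per-column lookup of this kind can be sketched in a few lines. The version below is a hypothetical stand-in, not the library's code: the name `lookup_tolerance`, the `"default"` fallback key, and the final fallback of 0.0 are assumptions made for illustration.

```python
from typing import Dict


def lookup_tolerance(column: str, tol_dict: Dict[str, float]) -> float:
    """Hypothetical stand-in for a per-column tolerance lookup.

    Assumption: a "default" key supplies the fallback, and 0.0 is used
    when neither the column nor "default" is present.
    """
    if column in tol_dict:
        return tol_dict[column]
    return tol_dict.get("default", 0.0)


tolerances = {"price": 0.01, "default": 0.001}
price_tol = lookup_tolerance("price", tolerances)        # explicit entry
quantity_tol = lookup_tolerance("quantity", tolerances)  # falls back to "default"
missing_tol = lookup_tolerance("quantity", {})           # nothing configured
```

Keeping the fallback inside one function means callers never need to check whether a column has its own entry.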
Pandas Usage - datacompy 0.19.2 documentation - GitHub Pages
Jan 1, 2017 · DataComPy sorts by the other fields before generating the temporary ID, then matches directly on that field. If there are a lot of duplicates you may need to join on more columns, or handle …
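The temporary-ID idea can be illustrated without any DataFrame library. The sketch below is an assumption-laden illustration, not DataComPy's implementation: `add_temp_id` and the `_temp_id` field name are hypothetical, rows are plain dictionaries, and values within each column are assumed to be mutually comparable.

```python
from collections import defaultdict
from typing import Dict, List

Row = Dict[str, object]


def add_temp_id(rows: List[Row], join_columns: List[str],
                id_name: str = "_temp_id") -> List[Row]:
    """Return copies of rows with a per-key sequence number in id_name."""
    if not rows:
        return []
    other_columns = sorted(c for c in rows[0] if c not in join_columns)
    # Sort by the join key, then by the remaining fields, so duplicate
    # keys receive their sequence numbers in a deterministic order.
    ordered = sorted(
        rows, key=lambda r: tuple(r[c] for c in join_columns + other_columns)
    )
    counters = defaultdict(int)
    numbered = []
    for row in ordered:
        key = tuple(row[c] for c in join_columns)
        numbered.append({**row, id_name: counters[key]})
        counters[key] += 1
    return numbered


# The same duplicate key (acct 1) appears in a different order on each side.
left = [{"acct": 1, "amt": 20}, {"acct": 1, "amt": 10}, {"acct": 2, "amt": 5}]
right = [{"acct": 1, "amt": 10}, {"acct": 1, "amt": 20}, {"acct": 2, "amt": 5}]

left_ids = add_temp_id(left, ["acct"])
right_ids = add_temp_id(right, ["acct"])
temp_ids = [r["_temp_id"] for r in left_ids]
```

After numbering, the pair (`acct`, `_temp_id`) is unique on each side, so the two sides can be matched directly even though `acct` alone is duplicated.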
Fugue Detail - datacompy 0.19.2 documentation - GitHub Pages
DataComPy uses Fugue to partition the two dataframes into chunks, and then compare each chunk in parallel using the Pandas-based Compare. The comparison results are then aggregated to produce …
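The partition, compare, and aggregate pattern described here can be sketched with the standard library alone. This is only an illustration under stated assumptions: a thread pool stands in for Fugue, a plain dictionary comparison stands in for the Pandas-based Compare, and every function name below is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

Row = Dict[str, object]


def partition(rows: List[Row], key: str, n_chunks: int) -> List[List[Row]]:
    """Bucket rows by hash of the join key so equal keys share a chunk."""
    chunks: List[List[Row]] = [[] for _ in range(n_chunks)]
    for row in rows:
        chunks[hash(row[key]) % n_chunks].append(row)
    return chunks


def compare_chunk(left: List[Row], right: List[Row], key: str) -> Dict[str, int]:
    """Count equal, different, and unmatched keys within one chunk."""
    left_by_key = {r[key]: r for r in left}
    right_by_key = {r[key]: r for r in right}
    shared = left_by_key.keys() & right_by_key.keys()
    equal = sum(1 for k in shared if left_by_key[k] == right_by_key[k])
    return {
        "equal": equal,
        "different": len(shared) - equal,
        "left_only": len(left_by_key.keys() - shared),
        "right_only": len(right_by_key.keys() - shared),
    }


def chunked_compare(left: List[Row], right: List[Row], key: str,
                    n_chunks: int = 4) -> Dict[str, int]:
    """Compare chunk pairs in parallel, then sum the per-chunk counts."""
    pairs = zip(partition(left, key, n_chunks), partition(right, key, n_chunks))
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda p: compare_chunk(p[0], p[1], key), pairs))
    totals = {"equal": 0, "different": 0, "left_only": 0, "right_only": 0}
    for result in results:
        for name, count in result.items():
            totals[name] += count
    return totals


left = [{"id": i, "value": i * 10} for i in range(10)]
right = [{"id": i, "value": -1 if i == 3 else i * 10} for i in range(1, 11)]
summary = chunked_compare(left, right, "id")
```

Partitioning both sides with the same hash of the join key is what makes the chunks independent: a given key can only ever appear in one chunk pair, so per-chunk counts can simply be added.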
Installation - datacompy 0.19.2 documentation - GitHub Pages
PyPI (basic): pip install datacompy. A Conda environment or virtual environment is highly recommended. conda (installs dependencies from Conda Forge): conda create --name datacompy …
datacompy - datacompy 0.19.2 documentation - GitHub Pages
datacompy.core module Compare Compare.all_columns_match() Compare.all_mismatch() Compare.all_rows_overlap() Compare.count_matching_rows() Compare.df1 …
Polars Usage - datacompy 0.19.2 documentation - GitHub Pages
DataComPy’s Polars implementation is a very similar port of the Pandas version. There are some differences you should be aware of, as they may yield slightly different results.
Developer Instructions - datacompy 0.19.2 documentation
For datacompy we want to use a simple branching workflow and follow Semantic Versioning for each release. develop is the default branch, where most people will work day to day.
datacompy.spark package - datacompy 0.19.1 documentation
datacompy.spark.helper.sort_rows(base_df: DataFrame, compare_df: DataFrame) → DataFrame: Add a new column to each DataFrame that numbers the rows, so they can be compared by row number.
Template Customization Guide - datacompy 0.19.2 documentation
DataComPy allows you to customize the report output by providing your own Jinja2 templates. This guide explains how to create and use custom templates for comparison reports.