Datawaza#
Datawaza is a collection of tools for data exploration, visualization, data cleaning, pipeline creation, model iteration, and evaluation. It builds upon core libraries like pandas, matplotlib, seaborn, and scikit-learn.
Modules#
Installation#
The latest releases can be found on PyPI. Install Datawaza with pip:
pip install datawaza
See the Change Log for a history of changes.
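To confirm that the install succeeded, you can ask Python which version of the package it sees. The snippet below is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and is not part of Datawaza.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version("datawaza")
    if found is None:
        print("datawaza is not installed; run: pip install datawaza")
    else:
        print(f"datawaza {found} is installed")
```

Run it in the same environment where you ran pip, since each virtual environment keeps its own set of installed packages.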
User Guide#
The User Guide is a Jupyter notebook that walks through how to use the Datawaza functions. It’s probably the best place to start; after that, you can consult the function specs organized by module above.
Source Code#
You can find the Datawaza repo on GitHub. Please submit any issues there. Datawaza is distributed under the GNU General Public License. Contributions are welcome!
What is Waza?#
Waza (技) means “technique” in Japanese. In martial arts like Aikido, it is combined with other words to form terms like “suwari-waza” (sitting techniques) or “kaeshi-waza” (reversal techniques). So we’ve paired it with “data” to represent Data Science techniques: データ技 “data-waza”.
Origin Story#
Most of these functions were created while I was pursuing a Professional Certificate in Machine Learning & Artificial Intelligence <https://em-executive.berkeley.edu/professional-certificate-machine-learning-artificial-intelligence> from U.C. Berkeley. With every assignment, I tried to simplify repetitive tasks and streamline my workflow. These functions served me well, so I’m publishing this library in the hope that it may help others.