In NLP, part-of-speech tagging is a process in which you mark words in a text (aka corpus) as corresponding parts of speech (e.
The other weekend I implemented a simple sentiment classifier for tweets in Kotlin with Naive Bayes.
In the final part of my series on ML model evaluation metrics, we’ll talk about metrics that can be applied to regression problems.
Hi! Welcome back to the second part of my series on different machine learning model evaluation metrics.
If you’re at the beginning of your machine learning journey, you may be taking online courses, reading books on the topic, dabbling in competitions and maybe even starting your own pet projects.
Whether you want to do exploratory data analysis or train a machine learning model, the first thing you will inevitably have to do is clean the data you’ve got.
Pandas is an essential library in a data scientist’s toolbox. If you’re just starting to learn, you’ll find a lot of great intro tutorials that’ll help you take your first steps with it.
In this last part of the “getting data” sub-series, I want to mention, without going into too much detail, one more way of obtaining data that you may need for your data science project.
In my previous blog post I talked about getting data from a CSV file (even if it’s messed up) or a database.
Quite obviously, data science is not really possible without data. Before you can start munging your data, visualizing it, or training models on it, you need to get your hands on it first.