I love analyzing data related to India. There is a ton of data available on Kaggle, and those datasets are a great way to practice what I have learned. The fun thing about Kaggle is that you can publish your entire analysis, along with the code, in a Jupyter Notebook for others to see. That way, you get great feedback from the community, and it also helps you stay engaged in your process of learning Data Science. So, I recommend doing the same if you're learning Data Science.
Now, before I…
Since I started my journey to become a Data Scientist, I have learned a ton of useful stuff that proved crucial when dealing with data from multiple sources. Of all of it, I find text/string matching to be one of the most useful techniques to know.
To explain the importance of text/string matching, let me present you with a scenario. I have a DataFrame named Indian_Crime_Data representing India’s total crimes under the Indian Penal Code in each state in the year 2011.
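To make the idea concrete, here is a minimal sketch of why string matching matters when state names come from more than one source. It uses only Python's standard-library `difflib`, not any particular method from this post. The two lists of state names, the helper names `normalize` and `match_names`, and the `cutoff` value are all assumptions made up for illustration; they are not taken from Indian_Crime_Data.

```python
from difflib import get_close_matches

# Hypothetical spellings of the same states in two different sources.
crime_data_states = ["Andhra Pradesh", "Tamil Nadu", "Odisha", "Jammu & Kashmir"]
other_source_states = ["Andhra pradesh", "Tamilnadu", "Orissa", "Jammu and Kashmir"]


def normalize(name):
    """Lowercase, spell out '&', and collapse repeated whitespace."""
    return " ".join(name.lower().replace("&", " and ").split())


def match_names(names, candidates, cutoff=0.6):
    """Map each name to its closest candidate, or None if nothing is close enough."""
    # Compare normalized forms, but return the candidate's original spelling.
    normalized = {normalize(candidate): candidate for candidate in candidates}
    result = {}
    for name in names:
        hits = get_close_matches(normalize(name), list(normalized), n=1, cutoff=cutoff)
        result[name] = normalized[hits[0]] if hits else None
    return result


matches = match_names(other_source_states, crime_data_states)
for source_name, matched_name in matches.items():
    print(f"{source_name!r} -> {matched_name!r}")
```

An exact comparison would link none of these four pairs, since every spelling differs in case, spacing, or wording, while the fuzzy match links all of them. The cutoff is a judgment call: set it too low and unrelated names start to match, so it is worth eyeballing the results before merging anything.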
An aspiring Data Scientist (yes, it’s a classy way of saying I am a noob, LOL), happy to share my struggles and knowledge.