Natural Language Processing (NLP) using Python

NLP101

About this Course

There is a surplus of data all around us, but according to industry estimates, only 21% of the available data is present in structured form, where information is directly accessible without processing. Organizations today deal with huge amounts and a wide variety of data: customer calls, emails, tweets, data from mobile applications, and more. It takes a lot of time and effort to make this data useful. One of the core skills in extracting information from text data is Natural Language Processing (NLP).
Natural Language Processing (NLP) is the art and science of extracting information from text and using it in computations and algorithms. NLP can power many applications, such as language translation, question answering systems, chatbots, and document summarizers. Given the growth of content on the internet and social media, it is one of the must-have skills for data scientists.
This course is designed for people who are looking to get into the field of Natural Language Processing. It provides everything you need to know to become an NLP practitioner.

Pre-requisites

As this is a practical and advanced course, you need a good grasp of the basics of Machine Learning. Familiarity with the Python language is an added benefit (although Python will be covered in the course).

Syllabus

In this module, you will be introduced to the wonderful field of Natural Language Processing (NLP). You will get a high-level overview of NLP and the tasks associated with it.

For a data scientist, knowing the right tool is very important, because it helps convert ideas into practical working models. This module is a refresher on Python, an industry-grade tool for NLP tasks.

This module covers the concept of Regular Expressions and how they can be used to extract useful information from text.
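
As a rough illustration of the idea, here is a minimal sketch that pulls email addresses, hashtags, and mentions out of a short string with Python's built-in `re` module. The sample text, the pattern names, and the `extract_entities` helper are hypothetical and not taken from the course material; the patterns are deliberately simple and will not cover every valid email address.

```python
import re

# Hypothetical sample text for illustration only.
SAMPLE = "Contact support@example.com or sales@example.org. #NLP is fun, says @data_fan!"

# Simplified patterns (assumptions): real-world emails and handles can be more complex.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
HASHTAG_RE = re.compile(r"#(\w+)")
# The lookbehind skips the "@" that sits inside an email address.
MENTION_RE = re.compile(r"(?<![\w.+-])@(\w+)")


def extract_entities(text):
    """Return the emails, hashtags, and mentions found in text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "hashtags": HASHTAG_RE.findall(text),
        "mentions": MENTION_RE.findall(text),
    }


if __name__ == "__main__":
    print(extract_entities(SAMPLE))
```

Compiling the patterns once and reusing them keeps the helper cheap to call on many documents.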

In this module, you will learn about detecting named entities in text, which capture the most useful information in textual data, and how Named Entity Recognition is implemented using the NLTK library in Python.

In this module, you will learn about topic modelling, a technique used for finding hidden patterns among the words in a text.

Raw text has to be cleaned before we do predictive modeling on textual data. This module shows you the best practices for cleaning noisy text.
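
To give a flavour of what cleaning noisy text can involve, the sketch below lowercases a tweet-like string, strips URLs, mentions, and non-letter characters, and drops a few stopwords. The `clean_text` function and the tiny `STOPWORDS` set are assumptions made for this example, not the course's own pipeline; a real project would usually rely on a fuller stopword list.

```python
import re

# A tiny, hypothetical stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "it", "this"}


def clean_text(text):
    """Lowercase text, strip URLs, mentions and non-letters, drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(clean_text("Check THIS out!!! https://t.co/abc123 @user #NLP is the best :)"))
    # prints: check out nlp best
```

The order of the steps matters: URLs and mentions are removed before the non-letter filter, otherwise their fragments would survive as ordinary words.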

In this project, you will apply what you learned in the previous modules to clean tweet data and analyze it.

To analyse processed textual data, it needs to be converted into features. Depending upon the usage, text features can be constructed using assorted techniques: syntactic parsing, entity / N-gram / word-based features, statistical features, and word embeddings. You will get to know them in this module.
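
As one small example of the N-gram and word-based features mentioned above, the following sketch counts unigrams or bigrams using only the standard library. The `ngrams` and `ngram_counts` helpers are hypothetical names chosen for this illustration, and the whitespace tokenization is a simplifying assumption.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n=1):
    """Count n-grams in text using naive lowercase whitespace tokenization."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n))


if __name__ == "__main__":
    counts = ngram_counts("the cat saw the cat", n=2)
    print(counts[("the", "cat")])  # prints: 2
```

The resulting counts can be turned into a fixed-length feature vector by choosing a vocabulary of n-grams and reading off the count for each entry.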

  • Introduction to Deep Learning (Optional)
    This module briefly covers the basics of deep learning. We will learn what a neural network is and how forward and backward propagation work in a neural network. We will also look at how convolutions work in Convolutional Neural Networks (CNNs) and cover the concepts of Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM).
  • Deep Learning for NLP
    Now that you are familiar with the field of NLP, it is crucial to be aware of the state-of-the-art techniques used to solve NLP tasks. You will get to know the basics of deep learning techniques and how they can be leveraged to push the boundaries of what a machine can do on NLP tasks.

  • Machine Learning Algorithms
    This module gives an overview of machine learning algorithms. We will cover most of the algorithms used for classification problems.
  • Understanding Text Classification
    This module talks about a specific problem in the field of natural language processing, called text classification. Text classification is widely used in solving real-life problems such as email spam identification, topic classification of news, and sentiment analysis.
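
To make the idea of text classification concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. Naive Bayes is used here only as a familiar illustration; the course description does not say which algorithms it covers, and the class name, training sentences, and labels below are all made up for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data.
TRAIN_TEXTS = [
    "win a free prize now",
    "free cash offer click now",
    "are we meeting for lunch today",
    "see you at the meeting tomorrow",
]
TRAIN_LABELS = ["spam", "spam", "ham", "ham"]

if __name__ == "__main__":
    model = NaiveBayesTextClassifier().fit(TRAIN_TEXTS, TRAIN_LABELS)
    print(model.predict("free prize offer"))
    print(model.predict("lunch meeting tomorrow"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps unseen words from zeroing out a class.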

In this project, you will practice applying NLP techniques to the real-life problem of SMS spam detection, where you have to classify SMS text messages as spam or non-spam.

This is the final project of the course, where you will get a chance to test your knowledge of the techniques covered in the course. Your task is to distinguish racist and sexist tweets from other tweets.

Projects

Project 1: Social Media Information Extraction

This project is designed to teach you how to extract relevant information such as entities, n-grams, keywords, and sentiments from social media data using NLP techniques. The project highlights the importance of NLP techniques for extracting business insights from text data.

Project 2: SMS Spam Classification

This project is about the classification of SMS text messages as spam or non-spam. In this project, students will learn preprocessing, feature engineering, and text classification techniques using machine learning models and the CNN model.

Project 3: Hate Speech Classification

Hate speech is an unfortunately common occurrence on the Internet. Often social media sites like Facebook and Twitter face the problem of identifying and censoring problematic posts while weighing the right to freedom of speech. The importance of detecting and moderating hate speech is evident from the strong connection between hate speech and actual hate crimes. Early identification of users promoting hate speech could enable outreach programs that attempt to prevent an escalation from speech to action.
The objective of this task is to detect hate speech in tweets. For the sake of simplicity, we say a tweet contains hate speech if it has a racist or sexist sentiment associated with it. So, the task is to distinguish racist or sexist tweets from other tweets.

Instructors

Shivam Bansal

Shivam Bansal is a full stack data scientist with more than 5 years of experience. He has led the development and execution of multiple end-to-end data science and analytics products for a number of clients from the Insurance, Healthcare, Retail, and Academia domains.
He has extensive experience with natural language processing and unstructured data analysis. He is currently ranked 2nd in the Kaggle Kernels ranking. He is the author of a book chapter on Deep Learning and has also shared a number of top-viewed articles on AnalyticsVidhya.

Frequently Asked Questions

This course is for people who are looking to get into the field of Natural Language Processing, or those who want to brush up their knowledge of NLP and get familiar with the trends in the field. The course provides everything you need to know to become an NLP practitioner.

The course assumes prior background in Machine Learning, so we recommend that you know the basics of Machine Learning before going through this course.

Yes, you will get information about all installations as part of the course.

The fee for this course is non-refundable.

We would highly recommend taking the course in the order in which it has been designed to gain the maximum knowledge from it.

Yes, you will be given a certificate upon satisfactory completion of the course.

The fee for this course is INR 10,999.

You will be able to access the course material for six months from the start of the course.

This is an online self-paced course, which you can take any time at your convenience over the 6 months after your purchase.

₹10,999 ($169)

  • Status: Active

Highlights

  • Projects: 3 Real Life

Support

If you are enrolled in the course, you can call us between 9 AM and 5 PM on weekdays (Monday to Friday) on +91-8368253068 or email us at training_queries@analyticsvidhya.com
