June 29, 2022

Analyze Twitter’s Reaction to Taylor Swift with HarperDB (Part 1)

Written by

Aakriti Sharma

Community Collaborator

Open in Medium.com

Summary of What to Expect

This project extracts tweets that mention ‘Taylor Swift’ and analyzes them in detail using natural language processing.

1. Data Collection:

• Install the Twint library for Python to scrape tweets containing the keyword "Taylor Swift."

• Use Twint to scrape tweets and store them in a CSV file.

2. Data Preprocessing:

• Handle null values by dropping unnecessary columns and rows.

• Identify the primary key for the dataset.

• Preprocess tweet text by removing hashtags, URLs, mentions, emojis, punctuation, and digits.

• Tokenize the cleaned tweets and remove stop words.

3. Save the cleaned data in a CSV file.

4. Set up HarperDB:

• Create a HarperDB account and instance.

• Launch the instance, create a schema, and define tables.

• Import the cleaned data into HarperDB.
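The preprocessing and import steps above can be sketched roughly as follows. This is a minimal illustration, not the article's actual code: the stop-word list is a tiny hand-written stand-in, emoji removal is approximated by dropping all non-ASCII characters, whole hashtags are dropped rather than just the "#" symbol, and the names `clean_tweet`, `tokenize`, `rows_to_csv`, and `harperdb_operations` are hypothetical. The article most likely imports data through HarperDB Studio; the payloads below assume the HarperDB operations API (`create_schema`, `create_table`, `csv_data_load`) as an alternative, and the schema, table, and key names are placeholders.

```python
import csv
import io
import re
import string

# Tiny illustrative stop-word list (assumption; a real project might use NLTK's).
STOP_WORDS = {"a", "an", "and", "the", "is", "to", "of", "in", "for", "on", "it"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_OR_HASHTAG_RE = re.compile(r"[@#]\w+")
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")  # crude emoji removal: drops all non-ASCII
DIGIT_RE = re.compile(r"\d+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Strip URLs, mentions, hashtags, emojis, digits, and punctuation."""
    text = URL_RE.sub(" ", text)  # URLs first, before punctuation is touched
    text = MENTION_OR_HASHTAG_RE.sub(" ", text)
    text = NON_ASCII_RE.sub(" ", text)
    text = DIGIT_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.lower().split())  # lowercase and collapse whitespace


def tokenize(cleaned):
    """Split on whitespace and drop stop words."""
    return [word for word in cleaned.split() if word not in STOP_WORDS]


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def harperdb_operations(schema, table, key, csv_text):
    """Build JSON bodies for the operations API (assumed, not from the article)."""
    return [
        {"operation": "create_schema", "schema": schema},
        {"operation": "create_table", "schema": schema, "table": table,
         "hash_attribute": key},  # hash_attribute plays the role of the primary key
        {"operation": "csv_data_load", "schema": schema, "table": table,
         "data": csv_text},
    ]


if __name__ == "__main__":
    raw = "Loving the new @taylorswift13 album!!! #Midnights https://t.co/abc123"
    tokens = tokenize(clean_tweet(raw))
    csv_text = rows_to_csv([{"id": "1", "tweet": " ".join(tokens)}], ["id", "tweet"])
    for op in harperdb_operations("twitter", "tweets", "id", csv_text):
        print(op["operation"])
```

Each payload would be sent as an authenticated JSON POST to the instance URL. The sketch only builds the request bodies, so it can be inspected without a running HarperDB instance.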

See part two of this series as well.
