SANTTU curriculum vitae
09 Jul 2019

Must-have Chrome extensions for Data Scientists

Browser extensions are software programs that let you tailor or add functionality to your web browser without altering any of its native code. Much of the success of Google Chrome (with ~62.7% browser market share) lies in its extensibility: there is a plugin or extension for just about everything you might want.

Here are the must-have Chrome extensions for data scientists to use in their daily workflows.

Web Scraper

Data is what data scientists rely upon to build their models. The state and volume of the data are often crucial in determining prediction accuracy. Insufficient (sparse) data, or learning from a poor data source, results in overly simplistic predictions and solutions. Given the right tools to procure and analyse it, data scientists can leverage the petabytes of data available on the web. In reality, however, extracting data from these sources often requires writing scraping scripts.

Web Scraper is a tool that extracts text, numbers and other content from a webpage. Its Chrome browser extension is built specifically for extracting data from web pages in minutes and for free. Using this extension, you create a plan (sitemap) describing how to traverse a website and what to extract. The Web Scraper then navigates the website according to the sitemap and extracts the relevant data. Scraped data can later be exported as CSV. The core features that place this extension on the must-have list are the ability to scrape multiple pages, various data selection types, data extraction from dynamic pages (JavaScript + AJAX) and its export capabilities.
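
Once the data is exported as CSV, it still has to be loaded and cleaned before it is useful for modelling. Here is a minimal sketch of that step using only the Python standard library. The sample content and the column names ("title", "price") are assumptions made up for illustration; a real export will have whatever columns your sitemap defines.

```python
import csv
import io

# Hypothetical stand-in for a CSV file exported from Web Scraper.
# The column names ("title", "price") are illustrative assumptions.
SAMPLE_EXPORT = """title,price
Widget A,19.99
Widget B,
Widget C,5.50
"""


def load_scraped_rows(csv_file):
    """Read scraped rows, convert price to float, and skip rows without a price."""
    rows = []
    for record in csv.DictReader(csv_file):
        raw_price = (record.get("price") or "").strip()
        if not raw_price:
            # Scraped pages often have missing fields; drop incomplete rows.
            continue
        rows.append({"title": record["title"].strip(), "price": float(raw_price)})
    return rows


rows = load_scraped_rows(io.StringIO(SAMPLE_EXPORT))
print(rows)
```

With a real export, you would pass an open file object instead of the in-memory sample, for example one opened with open(path, newline="", encoding="utf-8"), where path points to your own exported file.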

DECS

We all crawl the web in search of code. Data scientists often rely on scripts to perform and automate repetitive tasks. How many times have we painfully repeated the same steps to land on the same piece of code again? The scripts and code snippets typically used in pre-processing or machine learning (classification, regression or clustering) pipelines usually remain the same. So rather than remembering the complete details of a specific pipeline and rewriting the same code across several experiments or applications, it helps to reuse, refactor, repurpose and review code snippets.

DECS is a decentralised, end-to-end encrypted tool for managing code snippets. There are a few code snippet managers available, but end-to-end data encryption and a one-click code copy feature set it apart from the rest. A code snippet is intellectual property, and it’s essential that modern tools give more importance to privacy and user data ownership (given a choice, I would rather not store my code on someone else’s cloud or server where they could look into the data). With DECS – Code snippets manager, you can save your code snippets fully encrypted (on their server or anywhere else you want) and use them across several projects.

Since the data is encrypted, you can store your configs, including API keys, and they are always just a search away, saving a lot of time, effort and money. As the cream on top, with the DECS browser extension you can capture snippets that catch your eye (anywhere on a webpage, including Stack Overflow) with a single click and store them for future use.

Diigo Web Collector

Diigo is a useful extension for annotating, archiving and bookmarking webpages. With this easy-to-use tool, you can bookmark links to archive webpages or read them later. New data science papers on arXiv and Medium blog posts can be bookmarked and annotated with highlights and sticky notes for essential points.

The annotations can also be shared via social media, e.g. via Twitter and LinkedIn. Moreover, they can be accessed anywhere, on iPhone and iPad (App Store) or Android (Play Store).

All in all, this Chrome extension can be used to create bookmarks, form groups to pool findings, share resources and, finally, curate content, especially for building your training data.

These are the three must-have Chrome extensions for data scientists to use in their daily workflows. What do you think of the list? Let me know if you have any other Chrome extensions that are super useful for a data scientist.
