Hello, I’m Avijit!

I am a Ph.D. student at the Khoury College of Computer Sciences at Northeastern University, Boston, advised by Dr. Alan Mislove. I am an interdisciplinary researcher, an open-source developer, and an aspiring scientist. I strive to answer questions about human behavior on modern data-driven platforms and about ethics in AI. I am a proponent of social justice and of technology's role in addressing inequality.

Research Interests

  • Computational Social Science
  • Ethical AI
  • Privacy
  • Information Retrieval
  • Natural Language Processing
  • Machine Learning

Portfolio

WebSelect

Maximizing the reach of advertisements based on Network Structure

Complex Networks, Web Crawling, Advertisements, Genetic Algorithms, Python

Supervisors: Prof. Uttam Sarkar, MIS, IIM Calcutta and Prof. Agam Gupta, IIM Rohtak
Duration: Mar – May 2016
Demo Link: webselect.herokuapp.com

Designed a tool that selects the best subset of websites from a graph of websites to maximise the reach of advertisements within budget and demographic limits, using traffic information scraped from Alexa. A genetic algorithm is used to optimize the selection, as the original problem is NP-hard.

Image source: The Internet Map

Tweet Situationality

Extracting informative tweets from noise during a disaster

NLP, Twitter, Classification, Feature Engineering, Disaster

Supervisor: Prof. Niloy Ganguly, IIT Kharagpur
Duration: Nov 2015 – Jan 2016
Paper: ACM TWEB

A classifier that separates disaster-related tweets into situational and non-situational classes, using sentiment detection, dependency graphs, and linear patterns for feature engineering. The work was showcased at IBM Demo Day 2016 and is also part of a situational summarizer project published in ACM Transactions on the Web.

Savitr

System for Real-time Location Extraction from Microblogs during Emergencies

Twitter, Geolocation, Dash, Python, MongoDB, NLP, ML, Feature extraction

Supervisor: Prof. Saptarshi Ghosh, IIT Kharagpur
Duration: Jul 2017 – Jan 2018
Web Demo: http://savitr.herokuapp.com
Paper: WWW-SMERP 2018

A system that leverages information posted on Twitter to monitor and analyse emergency situations. We infer the locations mentioned in the microblog text in an unsupervised fashion and display them on a map-based interface. The system is designed for efficient performance: it achieves an F-score of 0.79 and is approximately two orders of magnitude faster than other available tools for location extraction.

SMA

How News and Word of Mouth Affect Stock Price

Finance, Market prediction, ML

Supervisor: Prof. Abhijeet Chandra, IIT Kharagpur
Duration: Jan 2017

A project that crawls social media (Facebook/Twitter) and top financial news websites to find the posts and news articles most likely to affect share prices. It was used in the demo built for Inter IIT Tech Meet 2017, which won the Silver Medal at the Stock Market Analysis Event.

Cascade

Modeling Connectedness of Firms in Financial Markets with Heterogeneous Agents

Complex Networks, Finance, Failure Prediction, Artificial Stock Market

Supervisor: Prof. Abhijeet Chandra, IIT Kharagpur and FNA.fi, UK
Duration: Jul 2018 onwards

The project aims to discover connectedness and study heterogeneous agents in a financial network by modelling the decomposition of volatility spillover or variance through networks. We are currently discovering patterns in the WIOT dataset, where we measure cascading failure in a time-series network of trading data between countries from 2000 to 2014. This is one of the 9 projects at the institute awarded the SGSIS Students' Challenge grant (worth INR 1 million), chosen from 30 finalists out of around 1000 submitted projects.

Image source: Bloomberg

Facebook Ad Transparency

A sharper look at advertising and targeting patterns on Facebook

Facebook, Propublica, Advertisements, Behavior Patterns, Transparency, Controlled experiments

Supervisor: Prof. Alan Mislove, Northeastern University
Duration: May – Jul 2018

A study of advertiser behavior and targeting patterns on Facebook, using the ad reach information obtained from Facebook’s ad transparency feature and the personal targeting data in Propublica’s Facebook ad dataset, supported by controlled ad placement experiments.

Targeted Commenting

Selective data-driven commenting system for Online News Media

NLP, Deep Learning, Web Design

Duration: Jun – Oct 2018
Paper: ECIR 2019

Most user comments on online news articles are relevant only to particular sections of an article. In this work, we build a system that automatically classifies comments and displays them against the relevant sections or paragraphs of an article. To implement it, we develop a mechanism based on a deep neural network to find the comments relevant to any section, along with a paragraph-wise commenting interface to showcase them. Such a data-driven commenting system can help news websites further increase reader engagement.

Madad

The all-purpose SOS App

Android, Geolocation, SOS

Duration: Feb 2016
Link: GitHub

Security is increasingly important today, and the widespread adoption of smartphones means we can use them to solve a large number of problems. We developed an Android application that lets people facing any kind of peril signal for help to their preferred contacts and to people nearby. The app won the Gold Medal in the Software Development Event at the Inter IIT Tech Meet. I was responsible for the app frontend and REST API integration.

Molecule2Vec

Vector Space Representation of Organic Molecules

Doc2Vec, Deep Learning, Regression, Chemistry

Supervisor: Prof. Debasis Sarkar, IIT Kharagpur
Duration: 2017 – 2018

Converted 3D molecules to a vector space model using Doc2Vec. Using the trained molecule vectors and the solubility data, I trained ML and deep learning algorithms to predict solubility within an error of 0.3 g/litre. Almost 65% of the solubility variance was explained by the structure vectors alone. This work was accepted and presented at the EUCHEMS 2018 conference by the Royal Society of Chemistry.

Internship Experience

May 2019 - Present

Visiting Researcher

Study of how news companies promote different items on social media, investigating possible patterns of differential use.

Supervisor: Dr. Oana Goga

May - July 2018

Visiting Researcher

A study of advertiser behavior and targeting patterns on Facebook, using the ad reach information obtained from Facebook’s ad transparency feature and the personal targeting data in Propublica’s Facebook ad dataset, supported by controlled ad placement experiments.

Supervisor: Prof. Alan Mislove

Dec 2017 - Jan 2018

Data Science Intern

Built an automatic PDF report generation system that reads data from the company database. Predicted fraud likelihood by running simple statistical models and ML on the credit histories of consumers provided by client companies.

Supervisor: Gaurav Jain

May - July 2017

Summer Research Intern

- Implemented XTrack, a smart vehicle tracking algorithm that minimizes battery usage.
- Predicted Uber surge prices using spatio-temporal techniques such as the Neural Hawkes Process and the Recurrent Marked Temporal Point Process. Received the Best Internship Project award for this project.

Supervisors: Narendra Annamaneni, Poorvi Agrawal

May - Aug 2016

Google Summer of Code Student

Replaced the HTML XForms system used in the Android app with natively generated forms using the Forms REST API, and added offline form saving. Configured Travis CI to automatically build the APK and push it to the Play Store.

Supervisors: Rafal Korytkowski and Robert O’Connor

Education

Present

Northeastern University, Boston

Ph.D. in Computer Science

2019

Indian Institute of Technology, Kharagpur

B.Tech. in Chemical Engineering
M.Tech. in Financial Engineering
Minor in Computer Science

Publications

Accepted

  • ConPro 2019  "Analyzing Political Advertisers’ Use of Facebook’s Targeting Features." -ieee

    at the Workshop on Technology and Consumer Protection (ConPro) 2019, San Francisco, California, USA

    Avijit Ghosh, Giridhari Venkatadri, Alan Mislove

  • ECIR 2019  "Public Sphere 2.0: Targeted Commenting in Online News Media." -springer

    at the European Conference on Information Retrieval (ECIR) 2019, Cologne, Germany

    Ankan Mullick, Sayan Ghosh*, Ritam Dutt*, Avijit Ghosh*, Abhijnan Chakrabarty

  • WWW 2018  "SAVITR: A System for Real-time Location Extraction from Microblogs during Emergencies." -acm

    at the WWW 2018 Workshop on Exploitation of Social Media for Emergency Relief and Preparedness (SMERP) 2018, Lyon, France

    Ritam Dutt, Kaustubh Hiware, Avijit Ghosh, Rameshwar Bhaskaran

  • EUCHEMS 2018  "Molecule2Vec: Vector Space Representation of Organic Molecules for prediction of properties using Deep Neural networks." -slides

    at the European Congress of Chemical Sciences (EUCHEMS) 2018, Liverpool, UK

    Avijit Ghosh, Debasis Sarkar

  • WITS 2016  "WebSelect: A Research Prototype for Optimizing Ad Exposures based on Network Structure." -arxiv

    at the Workshop on Information Technology and Systems (WITS) 2016, Dublin, Ireland

    Avijit Ghosh, Agam Gupta, Divya Sharma, Uttam Sarkar

Media Coverage

  • Online  "Algorithms that 'Don't See Color': Comparing Biases in Lookalike and Special Ad Audiences." - arxiv

    Media articles: Propublica, Mother Jones

    Piotr Sapiezynski, Avijit Ghosh, Levi Kaplan, Alan Mislove and Aaron Rieke

Miscellaneous

  • "Supervised extraction of catchphrases from legal documents." - pdf

    Avijit Ghosh*, Prerit Gupta*, Ritam Dutt, Kaustubh Hiware, Arpan Mandal, Kripabandhu Ghosh and Saptarshi Ghosh

  • "Connectedness of Markets with Heterogeneous Agents and the Information Cascades." -abstract

    Avijit Ghosh, Aditya Chourasiya, Lakshay Bansal, Abhijeet Chandra

  • "Using Global Vectors in Social Interaction Network for Song Recommendation."

    Avijit Ghosh*, Sayan Ghosh*

* Equal contribution
