PROFILE
EDUCATION
12th Grade
CAREER
Data Engineer
Mar 2020 - Aug 2020
Terra Economics & Analytics Lab (TEAL) | Bangalore
Data Engineer Intern
As an intern, I gained an in-depth understanding of real-time workflows and
industry standards, while exploring new tools on a daily basis. This gave me a
head start on my career and led to a promotion to a full-time role, where I
became a key member of the team. I leveraged my skills to help develop a
new product, building one of its core components from scratch. I'm excited to
bring my persistence, diligence, and innovation to my next role.
Aug 2020 - Dec 2022
Data Engineer
Ingesting large volumes of data from diverse sources into back-end
databases through scalable scripts (a minimal ingestion sketch follows this list)
Cleaning raw data against company-wide standards, transforming
scrambled records into categorized, meaningful information
Developing internal utilities to solve complex scripting problems and
promote code reuse for recurring tasks
Working with cloud computing platforms such as GCP and AWS, using
services including EC2, S3, and RDS
Systematically maintaining GitHub issues during script refactoring for
tracking updates and maintaining transparency
Writing clean & understandable documentation
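A minimal sketch of what such an ingestion script might look like, assuming a CSV source and a Postgres back end; the connection string, file path, and table name are illustrative, not the production setup:

# Sketch: stream a large CSV into Postgres in chunks so memory stays bounded.
# Connection string, path, and table name are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/warehouse")

def ingest_csv(path: str, table: str, chunksize: int = 50_000) -> None:
    for chunk in pd.read_csv(path, chunksize=chunksize):
        # Normalize column names toward a company-wide standard (lowercase, underscores).
        chunk.columns = [c.strip().lower().replace(" ", "_") for c in chunk.columns]
        chunk.to_sql(table, engine, if_exists="append", index=False)

if __name__ == "__main__":
    ingest_csv("properties.csv", "raw_properties")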
Experience Overview
SKILLS
Core
Essentials
International Indian School
Jubail, Saudi Arabia
2015
Bachelor of Computer Applications
MS Ramaiah College
Bangalore, India
2016-2019
Notable Personal Achievements
Responsibly managed data collection and curation for 7 million+ rows
and various schemas that powered the property valuation determination
model
Took initiative in building and configuring a company-wide documentation
platform and migrating internal docs into a wiki-based service using wiki.js
Used Google Cloud's Maps API to design infrastructure for on-demand
property geocoding based on search queries, verifying the correctness of
geocoded results with the Levenshtein distance algorithm (see the first
sketch after this list)
Helped create a nationwide data coverage info metric using QGIS for
geospatial analysis
Took the initiative to devise a cost-cutting, resource-saving solution for
interruptions on EC2 instances: an automated tool built with SNS,
CloudWatch, Lambda & webhooks that monitors and pushes critical
alerts and reports to our Slack channel (see the second sketch after this list)
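A rough sketch of the geocoding verification idea mentioned above: geocode a search query, then accept the result only if its formatted address is textually close to the query under Levenshtein distance. The API key, similarity threshold, and use of the googlemaps and Levenshtein packages are assumptions, not the production pipeline:

# Sketch: geocode a property search query and verify the result with Levenshtein distance.
import googlemaps
import Levenshtein

gmaps = googlemaps.Client(key="YOUR_API_KEY")  # placeholder key

def geocode_if_close(query: str, max_rel_distance: float = 0.4):
    """Return (lat, lng) only if the geocoded address is textually close to the query."""
    results = gmaps.geocode(query)
    if not results:
        return None
    formatted = results[0]["formatted_address"]
    # Relative edit distance between the query and the returned address.
    rel = Levenshtein.distance(query.lower(), formatted.lower()) / max(len(query), len(formatted))
    if rel > max_rel_distance:
        return None  # likely a mismatch, reject the geocode
    location = results[0]["geometry"]["location"]
    return location["lat"], location["lng"]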
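A simplified sketch of the alerting tool: a Lambda handler receives the interruption notice via SNS and forwards it to a Slack incoming webhook. The webhook URL, environment variable, and message format are placeholders:

# Sketch: Lambda handler that relays EC2 interruption notices from SNS to Slack.
import json
import os
import urllib.request

SLACK_WEBHOOK_URL = os.environ.get("SLACK_WEBHOOK_URL", "https://hooks.slack.com/services/...")

def lambda_handler(event, context):
    for record in event.get("Records", []):
        message = record["Sns"]["Message"]  # CloudWatch / EC2 event payload as JSON text
        payload = {"text": f":rotating_light: EC2 interruption alert:\n{message}"}
        req = urllib.request.Request(
            SLACK_WEBHOOK_URL,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
    return {"statusCode": 200}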
Cron
Debugging
Documentation
OOP
Testing
.ipynb
SQL
RabbitMQ
Git
Tmux
Python
Linux
AWS
Google Cloud APIs
Selenium
YAML
XML
Docker
Logging
Migration
Queues
CLI
SSH
QGIS
Wiki.js
Postgres
SQLite
Metabase
Solr
OCR
WordPress
As a Data Engineer with a passion for building end-to-end data pipelines, I
have been instrumental in powering a range of frontend, analytics, and
market intelligence products. Over the past three years, I have been part of
an exciting journey with an early-stage real estate startup, contributing my
skills and creativity to achieve numerous significant milestones. From
inception to execution, we have consistently solved complex problems, and I
am excited to share my experience and expertise with you.
ETL
Asif Farhan Khan
Pandas
asiffarhankhan@outlook.com
linkedin.com/in/asiffarhankhan
github.com/asiffarhankhan
Jamshedpur, India
831012
COURSES
ACTIVITIES
EXPLORATORY PROJECTS
University of Helsinki
Elements of AI
Data Analytics Using R
Flex Analytics
3D Game Designing with Unity
RoboKart IIT
State level conference
Research on Cryptocurrencies
Toastmasters Club
Public Speaking
Casual Blogging
WordPress
Deep Learning
Binary Image Classification
Worked on a binary image classifier built with Keras/TensorFlow to predict
the class label of a given sample image. The task required a core
understanding of Convolutional Neural Networks and the various
mathematical functions involved in how any deep learning model actually
works. Throughout the project, I picked up various ML techniques such as:
Data Collection: the task was automated using the requests library,
making the whole process very efficient
Data Augmentation: the classifier was built on very little data; this was
overcome by applying augmentations to each image to grow the
dataset substantially
Data Preprocessing: involved alterations to the images, such as
converting files to arrays, that made the training process faster
Designing/Training/Testing: the classifier was designed using a
Sequential model in Keras, yielding 92% accuracy (a minimal sketch
follows this list).
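A compact sketch of the approach described above, assuming images are organized into per-class folders under data/train; the directory layout, image size, and hyperparameters are illustrative, not the original project code:

# Sketch: small CNN binary classifier with augmentation to compensate for limited data.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation (rotation, flips, zoom) expands the effective size of a small dataset.
train_gen = ImageDataGenerator(rescale=1.0 / 255, rotation_range=20,
                               horizontal_flip=True, zoom_range=0.2)
train_data = train_gen.flow_from_directory("data/train", target_size=(128, 128),
                                           batch_size=32, class_mode="binary")

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(128, 128, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation="relu"),
    Dense(1, activation="sigmoid"),  # single probability for the binary label
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_data, epochs=10)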
Machine Learning
Sentiment Analysis
Combined machine learning and NLP to train a Naive Bayes classifier that
analyzes the sentiment of a news headline. The program takes in the URL
of a news article and outputs whether the sentiment of the article headline
is Positive or Negative (see the sketch after this list).
Worked on web scraping using BeautifulSoup3, Newspaper, and other
libraries such as urllib and textblob for various functionalities
Made use of GitHub features and gained ample experience with forking,
pushing, committing, and making pull requests from the command line
Started this project as a Fake News Detector, working with 15 people
from around the world.
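A minimal sketch of the pipeline: fetch the headline with Newspaper, then classify it. Here TextBlob's built-in NaiveBayesAnalyzer stands in for the project's own trained classifier, and the URL is a placeholder:

# Sketch: classify the sentiment of a news headline fetched from a URL.
# TextBlob's NaiveBayesAnalyzer (trained on the NLTK movie_reviews corpus)
# stands in for the project's own classifier.
from newspaper import Article
from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

def headline_sentiment(url: str) -> str:
    article = Article(url)
    article.download()
    article.parse()
    blob = TextBlob(article.title, analyzer=NaiveBayesAnalyzer())
    return "Positive" if blob.sentiment.classification == "pos" else "Negative"

if __name__ == "__main__":
    print(headline_sentiment("https://example.com/some-news-article"))  # placeholder URL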
Automation/API
Tweepy
Worked with the Tweepy library in Python to write a script that uses the
Twitter API to browse through all the tweets of a specific user's account
and delete every retweet encountered along the way (see the sketch below).
Gained exposure to working with APIs and making API calls
Learned to configure setup.py for easy installation of dependencies and
packages.
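A condensed sketch of the idea using Tweepy's v1.1 API wrapper; the credentials and user handle are placeholders, and this is an approximation rather than the original script:

# Sketch: walk a user's timeline and undo every retweet found.
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

def delete_retweets(screen_name: str) -> None:
    for tweet in tweepy.Cursor(api.user_timeline, screen_name=screen_name, count=200).items():
        # Retweets in a timeline carry a retweeted_status attribute.
        if hasattr(tweet, "retweeted_status"):
            api.unretweet(tweet.id)
            print(f"Unretweeted {tweet.id}")

if __name__ == "__main__":
    delete_retweets("some_user")  # placeholder handle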
Automation/API
Web Scraping
This script scrapes data from the online article publishing platform
medium.com. It was written specifically to learn the basics of web
scraping: given the URL of a paid article, it extracts the article text and
reproduces the article in a new browser tab, making it readable without
a subscription (a minimal sketch follows below).
Required basic knowledge of writing HTML
Worked with libraries such as os, newspaper, textile, webbrowser,
urllib, and requests.
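A stripped-down sketch of the flow: extract the article text with Newspaper and open it as a local page in a new browser tab. The output path and HTML wrapper are illustrative:

# Sketch: extract article text from a URL and open a readable local copy in the browser.
import webbrowser
from pathlib import Path
from newspaper import Article

def open_readable_copy(url: str) -> None:
    article = Article(url)
    article.download()
    article.parse()
    # Wrap the extracted title and text in a minimal HTML page.
    html = f"<html><body><h1>{article.title}</h1><pre>{article.text}</pre></body></html>"
    out = Path("article.html")
    out.write_text(html, encoding="utf-8")
    webbrowser.open_new_tab(out.resolve().as_uri())

if __name__ == "__main__":
    open_readable_copy("https://medium.com/...")  # placeholder URL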