DATA SCIENTIST
ZEESHAN
[email protected]
914-765-9585
SUMMARY:
• Passionate about data, with around 5 years of experience in data analytics, data science and data modelling.
• Well versed in data extraction from sources such as HDFS, Azure Blob and web services, and in data ingestion for social media to clean and process data and to conduct trend and sentiment analysis.
• Knowledge of statistical analysis and modelling techniques for predictive modeling, such as regression, auto-correlation, analysis of variance, Z-test, P-test, Simpson’s paradox, hypothesis testing, normal distribution, Poisson distribution, continuous distributions and random distributions.
• Experience in data analysis, including descriptive statistics, quantile plots, percentile distributions, and handling outliers, extreme values and missing values in the data.
• Experience in the data preparation flow, including data ingestion, cleansing, normalization, identifying correlations and removing outliers.
• Good knowledge of and experience in writing complex queries using SQL on various RDBMS.
• Data analysis and visualization using tools such as R, Microsoft Power BI, Tableau and Azure Machine Learning (AML).
• Worked under the mentorship of senior data scientists.
• Experienced in machine learning algorithms in R and Azure ML, following model-building steps that include clustering, classification, regression and recommendation systems.
• Developed predictive models using methods such as bagging for classification, boosting for classification and random forest for regression to improve the efficiency of the predictive model.
• Well versed in GitHub repositories for version control.
• Familiar with streaming analysis with IoT on Microsoft Azure, using services such as HDInsight, Blob Storage, Data Lake and Power BI.
• Used advanced Excel functions to generate spreadsheets and pivot tables.
• Experienced in preparing detailed documents and reports while managing complex internal and external data analysis.
Work experience:
BLUE CROSS BLUE SHIELD - Eagan, MN July 2015 - Present
Data Scientist
Fraud detection analysis
Responsibilities:
• Collected and explored clinical data, patient historical data, and patients’ prescribed biological and pharmaceutical data to identify correlations and help with modeling to identify fraudulent insurance claims submitted by doctors and hospitals.
• Cleaned missing data using feature engineering.
• Validated and correlated the collected information using visualization techniques.
• Worked on a Power BI dashboard for product measurement.
• Visualized, interpreted and reported findings, and developed strategic uses of data using R and Power BI.
• Followed a data flow on the machine learning platform, using the Naive Bayes algorithm to build a classification model.
• Normalized the data and transformed all values to a common scale.
Technology Stack: R, SQL, Azure Machine Learning, Microsoft Power BI, Blob Storage, Github
What was to be predicted?
To predict suspicious claims submitted by doctors and hospitals which have a higher possibility of being
fraudulent.
What action was taken when the prediction was TRUE?
Each claim was run against the model. Claims predicted as possibly fraudulent were put on hold and
submitted for internal auditing. Many data elements (features) were analyzed across patients’ clinical
data and physicians’ diagnoses, comparing the prescriptions defined for a diagnosis with what was
actually prescribed.
How was the Model measured?
A Power BI dashboard was created to track the counts of fraudulent claims versus internal audit
observations. The dashboard provided KPIs for the claims successfully identified by the model and the
dollars saved. Visualizations showed week-over-week and month-over-month trends.
CSX - Jacksonville, FL Jul 2014 - Jun 2015
Junior Data Scientist
Transport data analysis
Responsibilities:

• Collected geographical location data, incoming sensor data from track and locomotives, and wheel-axle geometry.
• Conducted data exploration to support data preparation for model development.
• Validated and correlated the collected information using visualization techniques in Tableau.
• Created dashboards, interpreted and reported findings, and developed strategic uses of data using Microsoft Power BI.
• Analyzed historical data with a classification model in Azure Machine Learning, based on correlations in the collected data.
• Mentored by a senior data scientist in creating model patterns.
Technology Stack: R, Microsoft Azure ML, Microsoft Power BI, dplyr, ggplot, Azure SQL, Github
What was to be predicted?
To predict track related issues that could cause derailments and wheel and bearing damage.
What action was taken when the prediction was TRUE?
On predicting the issue, interconnected sensors called super-sites were deployed in order to study the
movements of the cars as the trains passed through super-sites depending on geometry, acoustic
sensors, AEI sensors (Automatic equipment identification) and geographical locations.
How was the Model measured?
The model was measured by the reduction in the incidence of derailments and wheel- and bearing-related
failures. The key was to catch a failure before it happened and to increase the notice time within which
maintenance personnel could act on the alert. A Microsoft Power BI dashboard was used to plot possible
track failures on a map showing the 21,000 miles of track information and locomotive positions.
OSI CONSULTING - Hyderabad, INDIA Aug 2010 - May 2012
Data Analyst
Responsibilities:

• Collected, tracked and analyzed data.
• Wrote SQL queries and scripts to extract and aggregate data and to validate the accuracy of the data.
• Performed daily integration and ETL tasks by extracting, transforming and loading data to and from different SQL Server databases.
• Prepared high-level analysis reports with Microsoft Excel.
• Used advanced Microsoft Excel functions and the Data Analysis package to create pivot tables.
• Optimized data collection procedures and generated reports on a weekly, monthly and quarterly basis.
• Successfully interpreted data to draw conclusions for managerial action and strategy.
• Used statistical techniques for hypothesis testing to validate data and interpretations.
• Presented findings and data to the team to improve strategies and operations.
Technology Stack: SQL, MS Excel, MS SQL
Education:
Bachelor’s in Mechanical Engineering, Osmania University, 2010
Master’s in Mechanical Engineering, Fairfield University, 2014
Core Skills:
• Data Sources: Azure Blob, Data Lake, HDFS, SQL Server, Excel
• Programming Languages: R, SQL
• Data Visualization: R, Python, MS Power BI, Azure Machine Learning
• Data Exploration: R, Azure ML
• Cloud Services: Azure ML, MS Power BI, Azure IoT Hub, HDInsight, Streaming Analytics
• Cloud Platforms: Azure, AWS
• Repository: Github