Data Scientist // Plano, TX
Date: Fri, 26 Jul 2019 07:41:39 -0700 (PDT)
Message-ID: <71846597-8cc4-4d25-a6ed-3f26ced59c99_at_googlegroups.com>
Role: Data Scientist
Location: Plano, TX
Duration: Long-Term
Primary Job Responsibilities
- Assess the constraints of the data science deployment environment
- Analyze and recommend tools that meet system requirements
- Select the development environment
- Create and configure an Azure data science environment
- Define technical success metrics and quantify risks
- Transform data into usable datasets by developing data structures, designing a data sampling strategy, and designing the data preparation flow
- Perform Exploratory Data Analysis (EDA): review visual analytics data to discover patterns and determine next steps; identify anomalies, outliers, and other data inconsistencies; and create descriptive statistics for a dataset
- Cleanse and transform data: resolve anomalies, outliers, and other data inconsistencies; standardize data formats; and set the granularity for data
- Apply feature extraction algorithms to numerical and non-numerical data, and scale features
- Perform feature selection: define the optimality criteria and apply feature selection algorithms
- Develop models:
  - Select an algorithmic approach, determine appropriate performance metrics, implement appropriate algorithms, consider data preparation steps that are specific to the selected algorithms
  - Split datasets, determine ideal split based on the nature of the data, determine number of splits, determine relative size of splits, ensure splits are balanced
  - Identify data imbalances, resample a dataset to impose balance, adjust performance metric to resolve imbalances, implement penalization
  - Train the model, select early stopping criteria, tune hyper-parameters
  - Evaluate model performance, score models against evaluation metrics, implement cross-validation, identify and address overfitting, identify root cause of performance results
Basic Qualifications (Minimum):
- Master's (minimum) or PhD (preferred), or other advanced degree in Computer Science or a related field
- 3-4 years (with MS) or 1-2 years (with PhD) of experience manipulating data sets and building statistical models using statistical computer languages (R, Python, SQL, Scala, etc.)
- Knowledge of advanced statistical techniques and concepts (regression, distributions, statistical tests and proper usage, etc.) and experience with their application
- Knowledge of a variety of machine learning techniques (clustering, decision trees, artificial neural networks, etc.) and their real-world advantages and drawbacks
- Knowledge of and experience with deep neural network modeling frameworks (TensorFlow, PyTorch, Caffe, etc.) is preferred
--
Warm Regards,
Shankar Allamsetti | Senior Recruiter
P: 281-823-9222 Ext 517 | E: Shankar.allamsetti_at_3sbc.com
11271 Richmond Ave, Suite #107, #108, Houston, TX-77082
3S Business Corporation. www.3sbc.com
**** Best way to reach me is through email ****
Received on Fri Jul 26 2019 - 16:41:39 CEST