Remote apache-spark Jobs

This Month

Paid Research Study for Data Professionals
apache-spark data-warehouse google-bigquery amazon-redshift Jul 27

User Research International is a research company based out of Redmond, Washington. Working with some of the biggest companies in the industry, we aim to improve your experience via paid research studies. Whether it's the latest video game or productivity tools, we value your feedback and experience. We are currently conducting a research study called the Cloud-Based Data Study. We are looking for currently employed Data Professionals who have experience with cloud-based data warehouses. This study is a one-time remote study conducted via an online meeting. We're offering $150 for participation. Sessions are 90 minutes long. These studies provide a platform for our researchers to receive feedback on existing or upcoming products and software. We have included the survey link for the study below. Taking the survey will help determine whether you fit the profile requirements; completing the survey does not guarantee you will be selected to participate. If it's a match, we'll reach out with a formal confirmation and any additional details you may need.

I have summarized the study details below. In order to be considered, you must take the survey below. Thank you!

Study: Cloud-Based Data Study

Gratuity: $150

Session Length: 90 mins

Location: Remote

Dates: Available dates are located within the survey

Survey: Cloud-Based Data Study Sign-Up


This Year

Backend Engineer Data Team
aws java apache-spark hadoop hbase backend Mar 26

Sonatype's mission is to enable organizations to better manage their software supply chain. We offer a series of products and services including the Nexus Repository Manager and Nexus Lifecycle Manager. We are a remote, talented product development group, and we work in small autonomous teams to create high-quality products. Thousands of organizations and millions of developers use our software. If you have a passion for challenging problems, software craftsmanship, and having an impact, then Sonatype is the right place for you. We are expanding our Data team, which is responsible for unlocking insight from vast amounts of software component data, powering our suite of products, and enabling our customers to make informed and automated decisions in managing their software supply chain. As a Backend Engineer, you will lead or contribute to the design, development, and monitoring of systems and solutions for collecting, storing, processing, and analyzing large data sets. You will work in a team made up of Data Scientists and other Software Engineers. No one is going to tell you when to get up in the morning, or dole out a bunch of small tasks for you to do every single day. Members of Sonatype's Product organization have the internal drive and initiative to make the product vision a reality. Flow should be the predominant state of mind.

Requirements:

  • Deep software engineering experience; we primarily use Java.
  • Database and data manipulation skills working with relational or non-relational models.
  • Strong ability to select and integrate appropriate tools, frameworks, and systems to build great solutions.
  • Deep curiosity for how things work and desire to make them better.
  • Legally authorized to work (without sponsorship) in Canada, Colombia, or the United States of America and are currently residing in the corresponding country.

Nice To Haves:

  • Degree in Computer Science, Engineering, or another quantitative field.
  • Knowledge and experience with non-relational databases (e.g., HBase, MongoDB, Cassandra).
  • Knowledge and experience with large-scale data tools and techniques (e.g., MapReduce, Hadoop, Hive, Spark).
  • Knowledge and experience with AWS Big Data services (e.g., EMR, Elasticsearch).
  • Experience working in a highly distributed environment, using modern collaboration tools to facilitate team communication.

What We Offer:

  • The opportunity to be part of an incredible, high-growth company, working on a team of experienced colleagues
  • Competitive salary package
  • Medical/Dental/Vision benefits
  • Business casual dress
  • Flexible work schedules that ensure time for you to be you
  • 2019 Best Places to Work (Washington Post and Washingtonian)
  • 2019 Wealthfront Top Career Launch Company
  • EY Entrepreneur of the Year 2019
  • Fast Company Top 50 Companies for Innovators
  • Glassdoor ranking of 4.9
  • Come see why we've won all of these awards
Senior Java Software Engineer
Anonos  
java spring apache-spark docker kubernetes senior Jan 22

We are looking for a Senior Software Engineer to join the Anonos BigPrivacy team.

As a member of our engineering team, you will have responsibility over the ongoing development and maintenance of state-of-the-art data privacy software. You will make expert design decisions and technology recommendations based on your broad knowledge of modern software development.

We are a 100% remote organization. We use Slack and Zoom for communication, Ansible, TravisCI and AWS for CI/CD, and GitHub/ZenHub for tracking user stories. We work using the Kanban methodology, with monthly releases, and have regular backlog grooming meetings and retrospectives to continuously improve our processes.

Our software is implemented in Java, Kotlin, and JavaScript (Node.js). We are looking for someone who has expert-level knowledge of Java or Kotlin and an interest in working with server-side JavaScript. You should also be comfortable automating tasks, writing shell scripts, and working with Linux servers and cloud environments (primarily AWS). Some other technologies we use: Docker, Kubernetes, Apache Spark, Cassandra, Apache Kafka, MongoDB, React.js, and the Spring framework.

Anonos takes pride in its high-quality software so you must be committed to a high standard of development and testing. We expect you to think about programming tasks critically and develop code that is clean, reusable, efficient, well-documented, and well-tested. If you can explain what the SOLID principles are and why they are beneficial, how to properly go about refactoring, and compare and contrast various testing frameworks, then you will likely be a good fit for our team.
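As a rough illustration of the design thinking this paragraph asks about (all class names here are invented for the sketch, not from Anonos's codebase), the dependency-inversion principle, the "D" in SOLID, might look like this in Python:

```python
from abc import ABC, abstractmethod

class UserStore(ABC):
    """Abstraction the high-level policy depends on (dependency inversion)."""
    @abstractmethod
    def save(self, user: str) -> None: ...

class InMemoryStore(UserStore):
    """Concrete store; a production version might wrap Cassandra or MongoDB."""
    def __init__(self) -> None:
        self.users: list[str] = []
    def save(self, user: str) -> None:
        self.users.append(user)

class RegistrationService:
    def __init__(self, store: UserStore) -> None:
        self.store = store  # injected, so tests can swap in a fake store
    def register(self, user: str) -> None:
        self.store.save(user)

store = InMemoryStore()
RegistrationService(store).register("ada")
print(store.users)  # ['ada']
```

Because `RegistrationService` depends only on the `UserStore` abstraction, it can be unit-tested against an in-memory fake, which is what makes test-driven development practical for code with external dependencies.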

We are interested in speaking with exceptional people who can bring the following to the team:

- 8+ years of Java software development experience
- Expert-level proficiency with object-oriented design and programming
- 100% committed to test-driven development; this is your preferred practice for developing software
- Experience working with the Apache Spark data processing framework
- Experience with the Spring framework and Spring Boot applications
- Interest in learning new technologies and tools (especially related to big data)
- Comfortable working in an Ubuntu Linux server environment
- Proficiency with Git, Maven, and Linux

Senior Data Scientist
r machine-learning python apache-spark cluster-analysis senior Jan 08

In the Senior Data Scientist role, you will have full ownership over the projects you tackle, contribute to solving a wide range of machine learning applications, and find opportunities where data can improve our platform and company. We are looking for an experienced and creative self-starter who executes well and can exhibit exceptional technical know-how and strong business sense to join our team. 


WHAT YOU'LL DO:

  • Mine and analyze data from company data stores to drive optimization and improvement of product development, marketing techniques and business strategies
  • Assess the effectiveness and accuracy of data sources and data gathering techniques
  • Develop and implement data cleansing and processing to evaluate and optimize data quality
  • Develop custom data models and algorithms to apply to data sets
  • Run complex SQL queries and existing automations to correlate disparate data to identify questions and pull critical information
  • Apply statistical analysis and machine learning to uncover new insights and predictive models for our clients
  • Develop the company's A/B testing framework and test model quality
  • Collaborate with data engineering and ETL teams to deploy models / algorithms in production environment for operations use
  • Develop processes and tools to monitor and analyze model performance and data accuracy
  • Perform ad-hoc analyses and present the results clearly
  • Create visualizations and tell the story behind the data
  • Communicate statistical analyses and machine learning models to executives and clients
  • Create and manage APIs
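The A/B-testing duty above boils down to deciding whether a difference between two variants is statistically meaningful. A minimal sketch using only the standard library (the conversion numbers are made up for illustration) is a two-proportion z-test:

```python
from math import erf, sqrt

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)            # pooled conversion rate
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))   # pooled standard error
    z = (p_b - p_a) / se
    # Normal CDF via the error function; two-sided p-value
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical experiment: variant A converts 200/1000, variant B 260/1000.
z, p = two_proportion_z(200, 1000, 260, 1000)  # z ≈ 3.19, p ≈ 0.0014
```

A p-value this small would let the team reject the null hypothesis that the variants convert equally; a production framework would add sample-size planning and guardrails against peeking, which this sketch omits.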

WHO YOU ARE:

  • 3-5+ years of relevant work experience
  • Extensive knowledge of Python and R
  • Clear understanding of various analytical functions (median, rank, etc.) and how to use them on data sets
  • Expertise in mathematics, statistics, correlation, data mining and predictive analysis
  • Experience with deep statistical insights and machine learning (Bayesian methods, clustering, etc.)
  • Familiarity with AWS cloud computing, including EC2, S3, and EMR
  • Familiarity with Geospatial Analysis/GIS
  • Other experience with programming languages such as Java, Scala and/or C#
  • Proficiency using query languages such as SQL, Hive, and Presto
  • Familiarity with big data ecosystems (Spark/PySpark, MapReduce, or Hadoop)
  • Familiarity with software development tools and platforms (Git, Linux, etc.)
  • Proven ability to drive business results with data-based insights
  • Self-initiative and an entrepreneurial mindset
  • Strong communication skills
  • Passion for data

WHAT WE OFFER:

  • Competitive Salary
  • Medical, Dental and Vision
  • 15 Days of PTO (Paid Time Off)
  • Lunch provided 2x a week 
  • Snacks, snacks, snacks!
  • Casual dress code
Senior Software Engineer, Data Pipeline
java scala go elasticsearch apache-spark senior Dec 31 2019

About the Opportunity

The SecurityScorecard ratings platform helps enterprises across the globe manage the cyber security posture of their vendors. Our SaaS products have created a new category of enterprise software and our culture has helped us be recognized as one of the 10 hottest SaaS startups in NY for two years in a row. Our investors include both Sequoia and Google Ventures. We are scaling quickly but are ever mindful of our people and products as we grow.

As a Senior Software Engineer on the Data Pipeline Platform team, you will help us scale, support, and build the next-generation platform for our data pipelines. The team's mission is to empower data scientists, software engineers, data engineers, and threat intelligence engineers to accelerate the ingestion of new data sources and present the data in a meaningful way to our clients.

What you will do:

  • Design and implement systems for ingesting, transforming, connecting, storing, and delivering data from a wide range of sources with various levels of complexity and scale.
  • Enable other engineers to deliver value rapidly with minimum duplication of effort.
  • Automate the infrastructure supporting the data pipeline through infrastructure-as-code and improved CI/CD pipelines.
  • Monitor, troubleshoot, and improve the data platform to maintain stability and optimal performance.

Who you are:

  • Bachelor's degree or higher in a quantitative/technical field such as Computer Science, Engineering, or Math
  • 6+ years of software development experience
  • Exceptional skills in at least one high-level programming language (Java, Scala, Go, Python or equivalent)
  • Strong understanding of big data technologies such as Kafka, Spark, Storm, Cassandra, Elasticsearch
  • Experience with AWS services including S3, Redshift, EMR and RDS
  • Excellent communication skills to collaborate with cross functional partners and independently drive projects and decisions

What to Expect in Our Hiring Process:

  • Phone conversation with Talent Acquisition to learn more about your experience and career objectives
  • Technical phone interview with hiring manager
  • Video or in person interviews with 1-3 engineers
  • At-home technical assessment
  • Video or in person interview with engineering leadership
Senior Machine Learning - Series A Funded Startup
machine-learning scala python tensorflow apache-spark machine learning Dec 26 2019
About you:
  • Care deeply about democratizing access to data.  
  • Passionate about big data and are excited by seemingly impossible challenges.
  • At least 80% of people who have worked with you put you in the top 10% of the people they have worked with.
  • You think life is too short to work with B-players.
  • You are entrepreneurial and want to work in a super fast-paced environment where the solutions aren't already predefined.
About SafeGraph: 

  • SafeGraph is a B2B data company that sells to data scientists and machine learning engineers. 
  • SafeGraph's goal is to be the place for all information about physical Places.
  • SafeGraph currently has 20+ people and has raised a $20 million Series A.  CEO previously was founder and CEO of LiveRamp (NYSE:RAMP).
  • Company is growing fast, over $10M ARR, and is currently profitable. 
  • Company is based in San Francisco but about 50% of the team is remote (all in the U.S.). We get the entire company together in the same place every month.

About the role:
  • Core software engineer.
  • Reporting to SafeGraph's CTO.
  • Work as an individual contributor.  
  • Opportunities for future leadership.

Requirements:
  • You have at least 6 years of relevant work experience.
  • Deep understanding of machine learning models, data analysis, and both supervised and unsupervised learning methods. 
  • Proficiency writing production-quality code, preferably in Scala, Java, or Python.
  • Experience working with huge data sets. 
  • You are authorized to work in the U.S.
  • Excellent communication skills.
  • You are amazingly entrepreneurial.
  • You want to help build a massive company. 
Nice to haves:
  • Experience using Apache Spark to solve production-scale problems.
  • Experience with AWS.
  • Experience with building ML models from the ground up.
  • Python, Database and Systems Design, Scala, TensorFlow, Apache Spark, Hadoop MapReduce.
VP of Engineering - Series A Funded Data Startup
scala python machine-learning apache-spark hadoop machine learning Dec 24 2019
About you:
  • High-velocity superstar.
  • You want the challenge of growing and managing remote teams.
  • You love really hard engineering challenges.
  • You love recruiting and managing super sharp people.
  • At least 80% of people who have worked with you put you in the top 10% of the people they have worked with.
  • You think life is too short to work with B-players.
  • You are entrepreneurial and want to work in a super fast-paced environment where the solutions aren't already predefined.
  • You walk through walls.
  • You want to help build a massive company.
  • You live in the United States or Canada.
About SafeGraph: 

  • SafeGraph is a B2B data company that sells to data scientists and machine learning engineers. 
  • SafeGraph's goal is to be the place for all information about physical Places.
  • SafeGraph currently has 20+ people and has raised a $20 million Series A.  CEO previously was founder and CEO of LiveRamp (NYSE:RAMP).
  • Company is growing fast, over $10M ARR, and is currently profitable. 
  • Company is based in San Francisco, Denver, and New York City but about 50% of the team is remote (all currently in the U.S.). We get the entire company together in the same place every month.


About the role:
  • Core member of the executive team, reporting directly to the CEO.
  • Oversee all engineering and machine learning.

Opportunity to be:

  • Be one of the first 40 people in a very fast-growing company
  • Be one of the core drivers of the company's success
  • Work with an amazing engineering team
  • Be on the executive team
  • Take on more responsibility as the company grows
  • Work with only A-players
Senior Big Data Software Engineer
scala apache-spark python java hadoop big data Dec 23 2019
About you:
  • Care deeply about democratizing access to data.  
  • Passionate about big data and are excited by seemingly impossible challenges.
  • At least 80% of people who have worked with you put you in the top 10% of the people they have worked with.
  • You think life is too short to work with B-players.
  • You are entrepreneurial and want to work in a super fast-paced environment where the solutions aren't already predefined.
  • You live in the U.S. or Canada and are comfortable working remotely.
About SafeGraph: 

  • SafeGraph is a B2B data company that sells to data scientists and machine learning engineers. 
  • SafeGraph's goal is to be the place for all information about physical Places.
  • SafeGraph currently has 20+ people and has raised a $20 million Series A.  CEO previously was founder and CEO of LiveRamp (NYSE:RAMP).
  • Company is growing fast, over $10M ARR, and is currently profitable. 
  • Company is based in San Francisco but about 50% of the team is remote (all in the U.S.). We get the entire company together in the same place every month.

About the role:
  • Core software engineer.
  • Reporting to SafeGraph's CTO.
  • Work as an individual contributor.  
  • Opportunities for future leadership.

Requirements:
  • You have at least 6 years of relevant work experience.
  • Proficiency writing production-quality code, preferably in Scala, Java, or Python.
  • Strong familiarity with map/reduce programming models.
  • Deep understanding of all things “database” - schema design, optimization, scalability, etc.
  • You are authorized to work in the U.S.
  • Excellent communication skills.
  • You are amazingly entrepreneurial.
  • You want to help build a massive company. 
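The map/reduce familiarity asked for above can be pictured in a few lines of plain Python (the sample input is invented; Spark and Hadoop distribute these same two phases, a per-record map and a grouped reduce, across a cluster):

```python
from functools import reduce
from collections import Counter

lines = ["spark maps then reduces", "hadoop maps then reduces"]

# Map phase: each record is turned into (word, 1) pairs independently,
# so this step parallelizes trivially across workers.
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffle + reduce phase: pairs are grouped by key and their counts summed.
counts = reduce(lambda acc, kv: acc + Counter({kv[0]: kv[1]}), mapped, Counter())
print(counts["reduces"])  # 2
```

This is the classic word-count shape; in Spark the same logic would typically be a `flatMap` followed by `reduceByKey` over a distributed dataset.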
Nice to haves:
  • Experience using Apache Spark to solve production-scale problems.
  • Experience with AWS.
  • Experience with building ML models from the ground up.
  • Experience working with huge data sets.
  • Python, Database and Systems Design, Scala, Data Science, Apache Spark, Hadoop MapReduce.