Remote Hadoop Jobs

This Month

Remote Senior Java Developer
java sql microservices apache-kafka hadoop senior Sep 08

Boyle Software is looking for a full-time Software Engineer with Java experience. Ideally, you've worked in a microservices-based environment and gained some experience with big data tools.

This is a remote role, but we are looking for someone within the same time zone as our team in Kyiv, Ukraine. 

There are many ways to define what qualifies an engineer as "Senior". We don't have a year requirement in mind; we believe there is more to it than that. We do need an experienced developer who can work independently on a modern application without much guidance. If that's you, apply below and let's chat!

Requirements:

  • 8+ years of experience in Java
  • Experience working on Agile projects with CI/CD pipelines
  • Comfortable with Linux command-line tools
  • Experience with relational databases such as SQL Server and Oracle
  • Understanding of big data and Hadoop is an added advantage
  • Hands-on experience with streaming technologies such as Kafka and Spark Streaming
  • Familiarity with modern tooling such as Git, Docker, and Terraform
  • Strong SQL skills: analytic functions, explain plans, and optimization for loading and performance
  • Experience designing and implementing multithreaded, concurrent, and distributed systems
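As a concrete illustration of the "analytic functions" item above: a window function ranks or aggregates over a partition of rows without collapsing them the way GROUP BY does. A minimal sketch using Python's built-in sqlite3 module (the table and data are invented for the example; window functions require SQLite 3.25+):

```python
import sqlite3

# Toy bookings table, invented purely for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO bookings VALUES (?, ?)",
                 [("east", 100), ("east", 300), ("west", 200)])

# RANK() OVER (...) is an analytic (window) function: it ranks rows
# within each region partition while keeping every row in the output.
rows = conn.execute("""
    SELECT region, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
    FROM bookings
    ORDER BY region, rnk
""").fetchall()
# rows -> [('east', 300, 1), ('east', 100, 2), ('west', 200, 1)]
```

Prepending `EXPLAIN QUERY PLAN` to the query would show how SQLite executes it, which is the spirit of the "explain plans" item.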
Share this job:

This Year

Senior Software Engineer/Developer
TopDevz  
hadoop bigdata powerbi senior Jul 04

We are looking for an experienced, senior Software Engineer/Developer who is excited to work on one of our many client projects, both greenfield (new) projects and legacy (support) projects in this technology stack. This is a remote position.

Skills & Requirements

The following skills are required:

  • Very experienced (5+ years) in software/app development.
  • Experienced in Power BI.
  • Experienced in Hadoop.
  • Good analytical skills; innovative and detail-oriented.
  • Good written and verbal communication skills.
  • Good problem-solving skills.
  • Significant attention to detail when writing code, including good commenting and code documentation skills.

Share this job:
Backend Engineer Data Team
aws java apache-spark hadoop hbase backend Mar 26

Sonatype’s mission is to enable organizations to better manage their software supply chain. We offer a series of products and services, including the Nexus Repository Manager and Nexus Lifecycle Manager. We are a remote and talented product development group, and we work in small autonomous teams to create high-quality products. Thousands of organizations and millions of developers use our software. If you have a passion for challenging problems, software craftsmanship, and having an impact, then Sonatype is the right place for you.

We are expanding our Data team, which is responsible for unlocking insight from vast amounts of software component data and powering our suite of products, enabling our customers to make informed and automated decisions in managing their software supply chain. As a Backend Engineer, you will lead or contribute to the design, development, and monitoring of systems and solutions for collecting, storing, processing, and analyzing large data sets. You will work in a team made up of Data Scientists and other Software Engineers.

No one is going to tell you when to get up in the morning, or dole out a bunch of small tasks for you to do every single day. Members of Sonatype's Product organization have the internal drive and initiative to make the product vision a reality. Flow should be the predominant state of mind.

Requirements:

  • Deep software engineering experience; we primarily use Java.
  • Database and data manipulation skills working with relational or non-relational models.
  • Strong ability to select and integrate appropriate tools, frameworks, systems to build great solutions.
  • Deep curiosity for how things work and desire to make them better.
  • Legally authorized to work (without sponsorship) in Canada, Colombia, or the United States of America and are currently residing in the corresponding country.

Nice To Haves:

  • Degree in Computer Science, Engineering, or another quantitative field.
  • Knowledge and experience with non-relational databases (e.g., HBase, MongoDB, Cassandra).
  • Knowledge and experience with large-scale data tools and techniques (e.g., MapReduce, Hadoop, Hive, Spark).
  • Knowledge and experience with AWS big data services (e.g., EMR, Elasticsearch).
  • Experience working in a highly distributed environment, using modern collaboration tools to facilitate team communication.

What We Offer:

  • The opportunity to be part of an incredible, high-growth company, working on a team of experienced colleagues
  • Competitive salary package
  • Medical/Dental/Vision benefits
  • Business casual dress
  • Flexible work schedules that ensure time for you to be you
  • 2019 Best Places to Work Washington Post and Washingtonian
  • 2019 Wealthfront Top Career Launch Company
  • EY Entrepreneur of the Year 2019
  • Fast Company Top 50 Companies for Innovators
  • Glassdoor ranking of 4.9
  • Come see why we've won all of these awards
Share this job:
Site Reliability Engineer
hadoop linux bigdata python ruby c Feb 14

The Wikimedia Foundation is hiring two Site Reliability Engineers to support and maintain (1) the data and statistics infrastructure that powers a big part of decision making in the Foundation and in the Wiki community, and (2) the search infrastructure that underpins all search on Wikipedia and its sister projects. This includes everything from eliminating boring things from your daily workflow by automating them, to upgrading a multi-petabyte Hadoop or multi-terabyte Search cluster to the next upstream version without impacting uptime and users.

We're looking for an experienced candidate who's excited about working with big data systems. Ideally you will already have some experience working with software like Hadoop, Kafka, Elasticsearch, Spark, and other members of the distributed computing world. Since you'll be joining an existing team of SREs, you'll have plenty of space and opportunities to get familiar with our tech (Analytics, Search, WDQS), so there's no need to immediately have the answer to every question.

We are a full-time distributed team with no one working out of the actual Wikimedia office, so we are all together in the same remote boat. Part of the team is in Europe and part in the United States. We see each other in person two or three times a year, either during one of our off-sites (most recently in Europe), the Wikimedia All Hands (once a year), or Wikimania, the annual international conference for the Wiki community.

Here are some examples of projects we've been tackling lately that you might be involved with:

  •  Integrating an open-source GPU software platform like AMD ROCm in Hadoop and in the Tensorflow-related ecosystem
  •  Improving the security of our data by adding Kerberos authentication to the analytics Hadoop cluster and its satellite systems
  •  Scaling the Wikidata query service, a semantic query endpoint for graph databases
  •  Building the Foundation's new event data platform infrastructure
  •  Implementing alarms that alert the team of possible data loss or data corruption
  •  Building a new and improved Jupyter notebooks ecosystem for the Foundation and the community to use
  •  Building and deploying services in Kubernetes with Helm
  •  Upgrading the cluster to Hadoop 3
  •  Replacing Oozie with Airflow as a workflow scheduler

And these are our more formal requirements:

  •    A couple of years' experience in an SRE/Operations/DevOps role as part of a team
  •    Experience supporting complex web applications running highly available, high-traffic infrastructure based on Linux
  •    Comfortable with configuration management and orchestration tools (Puppet, Ansible, Chef, SaltStack, etc.), and with modern observability infrastructure (monitoring, metrics, and logging)
  •    An appetite for the automation and streamlining of tasks
  •    Willingness to work with JVM-based systems
  •    Comfortable with shell and scripting languages used in an SRE/Operations engineering context (e.g., Python, Go, Bash, Ruby)
  •    Good understanding of Linux/Unix fundamentals and debugging skills
  •    Strong English language skills and ability to work independently as an effective part of a globally distributed team
  •    B.S. or M.S. in Computer Science, a related field, or equivalent related work experience. Don't feel you need a degree to apply; we value hands-on experience most of all.
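The scripting bullet above is about exactly this kind of glue work: small scripts that automate a recurring check. A self-contained Python sketch of the genre (the log format, hostnames, and threshold are all invented for the example):

```python
from collections import Counter

def hosts_over_threshold(log_lines, threshold=0.5):
    """Return hosts whose fraction of ERROR lines exceeds `threshold`.

    The "<host> <LEVEL> <message>" log format is invented for the example.
    """
    totals, errors = Counter(), Counter()
    for line in log_lines:
        host, level, _ = line.split(" ", 2)
        totals[host] += 1
        if level == "ERROR":
            errors[host] += 1
    return sorted(h for h in totals if errors[h] / totals[h] > threshold)

logs = [
    "db1 ERROR disk full",
    "db1 INFO heartbeat ok",
    "db1 ERROR disk full",
    "web1 INFO request served",
]
flagged = hosts_over_threshold(logs)  # db1 has 2/3 ERROR lines -> ['db1']
```

In practice a script like this would feed an alerting system (the monitoring/metrics/logging infrastructure mentioned above) rather than just returning a list.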

The Wikimedia Foundation is... 

...the nonprofit organization that hosts and operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge, free of interference. We host the Wikimedia projects, build software experiences for reading, contributing, and sharing Wikimedia content, support the volunteer communities and partners who make Wikimedia possible, and advocate for policies that enable Wikimedia and free knowledge to thrive. The Wikimedia Foundation is a charitable, not-for-profit organization that relies on donations. We receive financial support from millions of individuals around the world, with an average donation of about $15. We also receive donations through institutional grants and gifts. The Wikimedia Foundation is a United States 501(c)(3) tax-exempt organization with offices in San Francisco, California, USA.

The Wikimedia Foundation is an equal opportunity employer, and we encourage people with a diverse range of backgrounds to apply.

U.S. Benefits & Perks*

  • Fully paid medical, dental and vision coverage for employees and their eligible families (yes, fully paid premiums!)
  • The Wellness Program provides reimbursement for mind, body and soul activities such as fitness memberships, baby sitting, continuing education and much more
  • The 401(k) retirement plan offers matched contributions at 4% of annual salary
  • Flexible and generous time off - vacation, sick and volunteer days, plus 19 paid holidays - including the last week of the year.
  • Family friendly! 100% paid new parent leave for seven weeks plus an additional five weeks for pregnancy, flexible options to phase back in after leave, fully equipped lactation room.
  • For those emergency moments - long and short term disability, life insurance (2x salary) and an employee assistance program
  • Pre-tax savings plans for health care, child care, elder care, public transportation and parking expenses
  • Telecommuting and flexible work schedules available
  • Appropriate fuel for thinking and coding (aka, a pantry full of treats) and monthly massages to help staff relax
  • Great colleagues - diverse staff and contractors speaking dozens of languages from around the world, fantastic intellectual discourse, mission-driven and intensely passionate people

*Eligible international workers' benefits are specific to their location and dependent on their employer of record

Share this job:
Data Engineer
NAVIS  
hadoop web-services python sql etl machine learning Feb 11

NAVIS is excited to be hiring a Data Engineer for a remote, US-based position. Candidates based outside of the US are not being considered at this time. This is a NEW position due to growth in this area.

Be a critical element of what sets NAVIS apart from everyone else!  Join the power behind the best-in-class Hospitality CRM software and services that unifies hotel reservations and marketing teams around their guest data to drive more bookings and revenue.

Our Guest Experience Platform team is seeking an experienced Data Engineer to play a lead role in building and running our modern big data and machine learning platform that powers our products and services. In this role, you will be responsible for building the analytical data pipeline, data lake, and real-time data streaming services. You should be passionate about technology and complex big data business challenges.

You can have a huge impact on everything from the functionality we deliver for our clients, to the architecture of our systems, to the technologies that we are adopting. 

You should be highly curious with a passion for building things!



DUTIES & RESPONSIBILITIES:

  • Design and develop business-critical data pipelines and related back-end services
  • Identify and participate in simplifying and addressing scalability issues for the enterprise-level data pipeline
  • Design and build big data infrastructure to support our data lake

QUALIFICATIONS:

  • 2+ years of extensive experience with the Hadoop (or similar) ecosystem (MapReduce, YARN, HDFS, Hive, Spark, Presto, HBase, Parquet)
  • Experience building, breaking, and fixing production data pipelines
  • Hands-on SQL skills and background in other data stores like SQL Server, Postgres, and MongoDB
  • Experience with continuous delivery and automated deployments (Terraform)
  • ETL experience
  • Able to identify and participate in addressing scalability issues for enterprise-level data
  • Python programming experience
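Since the qualifications above combine ETL, SQL, and Python, a minimal extract-transform-load round trip might look like the following sketch (the CSV feed, schema, and table name are invented; a real pipeline would read from files or streams rather than an inline string):

```python
import csv, io, sqlite3

# Extract: a raw CSV feed, inlined here purely for the example.
raw = "guest,revenue\nalice,120.5\nbob,80\n"

def transform(rows):
    # Normalize types and drop malformed records instead of failing the load.
    for row in rows:
        try:
            yield row["guest"].strip().lower(), float(row["revenue"])
        except (KeyError, TypeError, ValueError):
            continue

# Load: an in-memory SQLite table stands in for the warehouse target.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (guest TEXT, amount REAL)")
conn.executemany("INSERT INTO revenue VALUES (?, ?)",
                 transform(csv.DictReader(io.StringIO(raw))))
total = conn.execute("SELECT SUM(amount) FROM revenue").fetchone()[0]
# total -> 200.5
```

The same extract/transform/load shape scales up to the Hadoop and streaming tools listed above; only the sources, sinks, and volumes change.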

DESIRED, BUT NOT REQUIRED SKILLS:

  • Experience with machine learning libraries like scikit-learn, TensorFlow, etc., or an interest in picking them up
  • Experience with R for mining structured and unstructured data and/or building statistical models
  • Experience with Elasticsearch
  • Experience with AWS services like Glue, S3, SQS, Lambda, Fargate, EC2, Athena, Kinesis, Step Functions, DynamoDB, CloudFormation, and CloudWatch is a huge plus

POSITION LOCATION:

There are 3 options for the location of this position (candidates based outside the US are NOT being considered at this time):

  • You can work remotely in the continental US with occasional travel to Bend, Oregon
  • You can be based at a shared office space in the heart of downtown Portland, Oregon
  • You can be based at our offices in Bend, Oregon (relocation assistance package available)



NAVIS OFFERS:

  • An inclusive, fun, values-driven company culture – we’ve won awards for it
  • A growing tech company in Bend, Oregon
  • Work / Life balance - what a concept!
  • Excellent benefits package with a Medical Expense Reimbursement Program that helps keep our medical deductibles LOW for our Team Members
  • 401(k) with generous matching component
  • Generous time off plus a VTO day to use working at your favorite charity
  • Competitive pay + annual bonus program
  • FREE TURKEYS (or pies) for every Team Member for Thanksgiving (hey, it's a tradition around here)
  • Your work makes a difference here, and we make a huge impact to our clients’ profits
  • Transparency – regular All-Team meetings, so you can stay in-the-know with what’s going on in all areas of our business
Share this job:
VP, Data Science & Engineering
machine-learning hadoop data science c machine learning big data Feb 10

The Wikimedia Foundation is seeking an experienced executive to serve as Vice President of Data Science & Engineering for our Technology department. At the Wikimedia Foundation, we operate the world’s largest collaborative project: a top ten website, reaching a billion people globally every month, while incorporating the values of privacy, transparency and community that are so important to our users. 

Reporting to the Chief Technology Officer, the VP of Data Science & Engineering is a key member of the Foundation’s leadership team and an active participant in the strategic decision making framing the work of the technology department, the Wikimedia Foundation and the Wikimedia movement.

This role is responsible for planning and executing an integrated multi-year data science and engineering strategy spanning our work in artificial intelligence, machine learning, search, natural language processing and analytics. This strategy will interlock with and support the larger organization and movement strategy in service of our vision of enabling every human being to share freely in the sum of human knowledge.

Working closely with other Technology and Product teams, as well as our community of contributors and readers, you’ll lead a team of dedicated directors, engineering managers, software engineers, data engineers, and data scientists who are shaping the next generation of data usage, analysis and access across all Wikimedia projects.

Some examples of our teams' work in the realm of data science and data engineering can be found on our blog, including deeper info on our work improving edit workflows with machine learning, our use of Kafka and Hadoop, and our analysis of people falling into the “Wikipedia rabbit hole”. Lately we have been thinking about how best to identify traffic anomalies that might indicate outages or, possibly, censorship.

You are responsible for:

  • Leading the technical and engineering efforts of a global team of engineers, data scientists, and managers focused on productionizing artificial intelligence, data science, analytics, machine learning, and natural language processing models, as well as data operations. These efforts currently encompass three teams: Search Platform, Analytics, and Scoring Platform (Machine Learning Engineering)
  • Working closely with our Research, Architecture, Security, Site Reliability and Platform teams to define our next generation of data architecture, search, machine learning and analytics infrastructure
  • Creating scalable engineering management processes and prioritization rubrics
  • Developing the strategy, plan, vision, and cross-functional teams to create a holistic data strategy for the Wikimedia Foundation, taking into account our fundamental values of transparency, privacy, and collaboration, in partnership with internal and external stakeholders and community members
  • Ensuring data is reliable, consistent, accessible, secure, and available in a timely manner for external and internal stakeholders, in accordance with our privacy policy
  • Negotiating shared goals, roadmaps and dependencies with finance, product, legal and communication departments
  • Contributing to our culture by managing, coaching and developing our engineering and data teams
  • Illustrating your success in making your mark on the world by collaboratively measuring and adapting our data strategy within the technology department and the broader Foundation
  • Managing up to 5 direct reports with a total team size of 20

Skills and Experience:

  • Deep experience leading data science, machine learning, search, or data engineering teams, with the ability to separate the hype in the artificial intelligence space from the reality of delivering production-ready data systems
  • 5+ years senior engineering leadership experience
  • Demonstrated ability to balance competing interests in a complex technical and social environment
  • Proven success at all stages of the engineering process and product lifecycle, leading to significant, measurable impact.
  • Previous hands-on experience in production big data and machine learning environments at scale
  • Experience building and supporting diverse, international and distributed teams
  • Outstanding oral and written English language communications

Qualities that are important to us:

  • You take a solutions-focused approach to challenging data and technical problems
  • A passion for people development, team culture and the management of ideas
  • You have a desire to show the world how data can be done while honoring the user’s right to privacy

Additionally, we’d love it if you have:

  • Experience with modern machine learning, search and natural language processing platforms
  • A track record of open source participation
  • Fluency or familiarity with languages in addition to English
  • Time spent living or working outside your country of origin
  • Experience as a member of a volunteer community


Share this job:
Data Science Course Mentor
python sql hadoop data science machine learning Jan 08

Apply here


Data Science Course Mentor

  • Mentorship
  • Remote
  • Part time


Who We Are
At Thinkful, we believe that if schools put in even half the amount of effort that students do, the outcomes would be better for everyone. People would have a path to a fulfilling future, instead of being buried under debt. Employers would benefit from a workforce trained for today. And education could finally offer students a return on their investment of both money and time.

We put in outlandish amounts of effort to create an education that offers our students a guaranteed return on their investment. We partner with employers to create a world-class curriculum built for today. We go to the ends of the earth to find mentors who are the best of the best. We invest more in career services than any of our peers. We work hard to be on the ground in the cities our students are in. Simply put, no other school works as hard for its students as we do.

The Position
Students enroll in Thinkful courses to gain the valuable technical and professional skills needed to take them from curious learners to employed technologists. As a Course Mentor, you will support students by acting as an advisor, counselor, and support system as they complete the course and land their first industry job. To achieve this, you will engage with students using the below range of approaches, known as Engagement Formats. Course Mentors are expected to provide support across all formats when needed. 

  • Mentor Sessions: Meet with students 1-on-1 in online video sessions to provide technical and professional support as the student progresses through the curriculum.
  • Group Sessions: Host online video sessions on topics of your expertise (in alignment with curriculum offerings) for groups of students seeking live support between mentor sessions.
  • Grading: Review student checkpoint submissions and deliver written feedback, including analysis of projects and portfolios.
  • Technical Coaching: Provide on-demand support for technical questions and guidance requests that come to the Technical Coaching team through text and video in a timely manner. This team also provides TA support for immersive programs.
  • Assessments & Mock Interviews: Conduct 1-on-1 mock interviews and assessments via video calls and provide written feedback to students based on assessment rubrics. 

In addition to working directly with students, Course Mentors are expected to maintain an environment of feedback with the Educator Experience team, and to stay on top of important updates via meetings, email, and Slack. Ideal candidates for this team are highly coachable, display genuine student advocacy, and are comfortable working in a complex, rapidly changing environment.

Requirements
  • Minimum of 3 years professional experience as a Data Scientist or demonstrated expertise with data visualizations and machine learning at an industry level
  • Proficiency in SQL, Python
  • Professional experience with Hadoop and Spark a plus
  • Excellent written and verbal communication
  • High level of empathy and people management skills
  • Must have a reliable, high-speed Internet connection

Benefits
  • This is a part-time role (10-25 hours a week)
  • Fully remote position, with the option to work evenings and weekends in person in 22 US cities
  • Community of 500+ like-minded Educators looking to impact others and keep their skills sharp
  • Full access to all of Thinkful Courses for your continued learning
  • Grow as an Educator

Apply
If you are interested in this position please provide your resume and a cover letter explaining your interest in the role.

Thinkful can only hire candidates who are eligible to work in the United States.

We stand against any form of workplace harassment based on race, color, religion, sexual orientation, gender identity or expression, national origin, age, disability, or veteran status. Thinkful provides equal employment opportunities to all employees and applicants. If you're talented and driven, please apply.

At this time, we are unable to consider applicants from the following states: Alaska, Delaware, Idaho, New Mexico, North Dakota, South Carolina, South Dakota, West Virginia, and Wyoming

Apply here
Share this job:
VP of Engineering - Series A Funded Data Startup
scala python machine-learning apache-spark hadoop machine learning Dec 24 2019
About you:
  • High-velocity superstar.
  • You want the challenge of growing and managing remote teams.
  • You love really hard engineering challenges.
  • You love recruiting and managing super sharp people.
  • At least 80% of people who have worked with you put you in the top 10% of the people they have worked with.
  • You think life is too short to work with B-players.
  • You are entrepreneurial and want to work in a super fast-paced environment where the solutions aren’t already predefined.
  • You walk through walls.
  • You want to help build a massive company.
  • You live in the United States or Canada.
About SafeGraph: 

  • SafeGraph is a B2B data company that sells to data scientists and machine learning engineers. 
  • SafeGraph's goal is to be the place for all information about physical Places.
  • SafeGraph currently has 20+ people and has raised a $20 million Series A. The CEO was previously founder and CEO of LiveRamp (NYSE: RAMP).
  • Company is growing fast, over $10M ARR, and is currently profitable. 
  • Company is based in San Francisco, Denver, and New York City but about 50% of the team is remote (all currently in the U.S.). We get the entire company together in the same place every month.


About the role:


  • Member of the executive team, reporting directly to the CEO
  • Oversee all engineering and machine learning

Opportunity to:

  • Be one of the first 40 people in a very fast-growing company
  • Be one of the core drivers of the company's success
  • Work with an amazing engineering team
  • Be on the executive team
  • Take on more responsibility as the company grows
  • Work with only A-Players
Share this job:
Senior Big Data Software Engineer
scala apache-spark python java hadoop big data Dec 23 2019
About you:
  • Care deeply about democratizing access to data.  
  • Passionate about big data and are excited by seemingly-impossible challenges.
  • At least 80% of people who have worked with you put you in the top 10% of the people they have worked with.
  • You think life is too short to work with B-players.
  • You are entrepreneurial and want to work in a super fast-paced environment where the solutions aren’t already predefined.
  • You live in the U.S. or Canada and are comfortable working remotely.
About SafeGraph: 

  • SafeGraph is a B2B data company that sells to data scientists and machine learning engineers. 
  • SafeGraph's goal is to be the place for all information about physical Places.
  • SafeGraph currently has 20+ people and has raised a $20 million Series A. The CEO was previously founder and CEO of LiveRamp (NYSE: RAMP).
  • Company is growing fast, over $10M ARR, and is currently profitable. 
  • Company is based in San Francisco but about 50% of the team is remote (all in the U.S.). We get the entire company together in the same place every month.

About the role:
  • Core software engineer.
  • Reporting to SafeGraph's CTO.
  • Work as an individual contributor.  
  • Opportunities for future leadership.

Requirements:
  • You have at least 6 years of relevant work experience.
  • Proficiency writing production-quality code, preferably in Scala, Java, or Python.
  • Strong familiarity with map/reduce programming models.
  • Deep understanding of all things “database” - schema design, optimization, scalability, etc.
  • You are authorized to work in the U.S.
  • Excellent communication skills.
  • You are amazingly entrepreneurial.
  • You want to help build a massive company. 
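The "map/reduce programming models" requirement above boils down to expressing a computation as a map step emitting key/value pairs, a shuffle grouping values by key, and a reduce step combining each group. A toy in-process word count sketching that shape (no Hadoop involved; frameworks distribute these same three phases across machines):

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc):
    # Map: emit one (key, value) pair per token -> ("word", 1).
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    # Shuffle: group values by key, as the framework does between phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: combine each key's values; for word count, sum them.
    return {key: sum(values) for key, values in grouped.items()}

docs = ["big data big deal", "data pipelines"]
counts = reduce_phase(shuffle(chain.from_iterable(map(map_phase, docs))))
# counts -> {'big': 2, 'data': 2, 'deal': 1, 'pipelines': 1}
```

Because the map and reduce steps are pure functions over independent chunks, the model parallelizes naturally, which is what Hadoop MapReduce and Spark exploit at scale.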
Nice to haves:
  • Experience using Apache Spark to solve production-scale problems.
  • Experience with AWS.
  • Experience with building ML models from the ground up.
  • Experience working with huge data sets.
  • Familiarity with Python, database and systems design, Scala, data science, Apache Spark, and Hadoop MapReduce.
Share this job: