FactSet creates flexible, open data and software solutions for over 200,000 investment professionals worldwide, providing instant access to financial data and analytics that investors use to make crucial decisions.
At FactSet, our values are the foundation of everything we do. They express how we act and operate, serve as a compass in our decision-making, and play a big role in how we treat each other, our clients, and our communities. We believe that the best ideas can come from anyone, anywhere, at any time, and that curiosity is the key to anticipating our clients’ needs and exceeding their expectations.
Your Team's Impact
This role sits within FactSet’s Cloud and Managed Services (C&MS) team, part of the Platforms and Environments department within Data Solutions. The team is responsible for delivering FactSet Bulk Data feeds to the cloud platforms that clients choose.
The C&MS Engineering team is responsible for the development of data pipelines that handle distribution of FactSet Content Data across various cloud platforms such as Snowflake, Databricks, AWS S3, Azure Blob and Amazon Redshift.
As part of a new initiative focused on delivering Vectorized Data through the FactSet Intelligence Platform, the team is expanding its capabilities to support evolving business needs and the increasing demand for vectorized data. We will use native AI engines such as Cortex in Snowflake and Genie in Databricks to build product lines that help clients integrate vectorized data into their agentic workflows.
We are looking for a Software Engineer III (Go and SQL) to join the C&MS team in Hyderabad.
You will contribute to building and evolving data delivery systems, with a strong focus on speed, delta processing, system design, and performance.
This role combines data engineering and backend development, with significant exposure to enterprise cloud platforms, large-scale datasets, SQL-heavy workflows, and system reliability challenges.
You will also contribute to the team’s ongoing transition from Perl-based systems to Python, helping modernise the technical stack.
Technical Environment: Go (primary), Python, SQL (high usage), REST APIs, Git, CI/CD pipelines
Cloud Platforms: Snowflake, Databricks, Amazon Redshift, Azure, AWS
The team: 5 software engineers based in India, 3 in the US, and 2 in the UK
What You'll Do
As an experienced Software Engineer with a background in data engineering, you will lead and deliver solutions that enable efficient vectorized data management and retrieval across our Snowflake and Databricks platforms. You will also mentor a growing team, collaborate with stakeholders and implement scalable data pipelines optimized for large-scale analytical and AI/ML workloads.
Contribute to the end-to-end design, implementation, and optimization of data pipelines for vectorized data storage, transformation, and serving.
Collaborate with cross-functional teams (data science, analytics, product) to define data requirements, architecture, and data models optimized for vector data.
Deliver high-performance data solutions using Databricks (Spark, MLflow, Delta Lake) and Snowflake, leveraging vectorized storage and retrieval mechanisms.
Develop robust ETL/ELT processes for large datasets, including data ingestion, feature extraction, and index building for vector similarity search.
Enhance existing data platforms to support vector data workloads, integrating with LLM and other AI/ML services as needed.
Ensure data governance, security, quality, and compliance with internal and external regulations.
Monitor, troubleshoot, and optimize data pipelines and processes for cost and performance.
Stay on top of emerging technologies in vector databases, cloud data platforms, and AI/ML infrastructure.
What We're Looking For
3+ years of experience in system design for large-scale, distributed systems
Good proficiency in Python, C#, and/or Go (Go preferred), along with AWS and Snowflake/Databricks
Experience with cloud-native application development and cloud platforms (AWS/Azure/GCP)
Hands-on experience with vector data storage, vector databases, and similarity search (e.g., ANN, Faiss, Pinecone, Milvus).
Hands-on expertise with modern data lake and cloud data platforms such as Snowflake, Databricks, Redshift, or similar
Experience in batch and/or real-time data processing applications
Experience integrating data platforms with AI/ML or LLM applications is a strong plus.
Strong written and verbal communication skills
Strong pragmatic, iterative approach to problem solving and prototyping
Experience with DevOps tools, CI/CD pipelines, and modern software development best practices
Company Overview:
FactSet (NYSE:FDS | NASDAQ:FDS) helps the financial community to see more, think bigger, and work better. Our digital platform and enterprise solutions deliver financial data, analytics, and open technology to more than 8,200 global clients, including over 200,000 individual users. Clients across the buy-side and sell-side, as well as wealth managers, private equity firms, and corporations, achieve more every day with our comprehensive and connected content, flexible next-generation workflow solutions, and client-centric specialized support. As a member of the S&P 500, we are committed to sustainable growth and have been recognized among the Best Places to Work in 2023 by Glassdoor as a Glassdoor Employees’ Choice Award winner. Learn more at www.factset.com and follow us on X and LinkedIn.
At FactSet, we celebrate difference of thought, experience, and perspective. Qualified applicants will be considered for employment without regard to characteristics protected by law.