Senior Software Engineer/SRE - Ticker Plant
Bloomberg - New York City, NY
Job Description
Who We Are

Bloomberg is the global leader in business and financial data. Providing real-time and historical market data to our customers - reliably, accurately, and quickly - is at the heart of what we do, and the Ticker Plant system is the core that makes it happen.

Our system processes hundreds of billions of unique market events every single day. We ingest and process events from hundreds of exchanges and thousands of other financial institutions, 24 hours a day, around the world, on millions of financial instruments across all asset classes, including stocks, bonds, commodities, currencies, and crypto. We disseminate corresponding updates to our clients in real time, after the events have been normalized and enriched by our systems. In addition, we respond to billions of requests for current snapshot and historical data every day, retrieved from our petabytes of recorded market history, to which we add terabytes of new data.

The SRE team is central to Ticker Plant's success! We are engineers whose expertise centers on the emergent properties of a large-scale, distributed, real-time market data system. Our mission aligns with our customers' expectations, and we focus on the characteristics of the system they care about, namely:

- Correctness: the data a customer sees should accurately reflect the marketplace.
- Performance: real-time latencies should be minimized; requests should be served without delay.
- Availability: system components will fail; in a sufficiently large system, parts of it fail all the time. But the system as a whole should not fail.

At the scale at which we operate, we cannot achieve these goals without sophisticated monitoring, proactive management, and automated response mechanisms. Thus, we concern ourselves with latency analysis, capacity management, cluster organization, deployment and configuration, fault tolerance, and telemetry.
In addition to developing software, we also advise our partner component teams on the development of resilient software, and we analyze and fix system failures as they happen.

What's in it for you:

- Design and develop predictive data models for our system capacity
- Build systems capable of early detection of issues through metrics and signals, and develop automated correction and remediation strategies
- Develop Python/C++ services, libraries and tools that implement our designs
- Proactively scale our services to stay ahead of ever-increasing market data demands by driving capacity planning, instrumentation and performance analysis
- Define service level objectives and apply them to drive measurable service improvement
- Manage entire projects, including meeting with partners and building implementation plans
- Share your accomplishments at internal forums and speak at industry conferences (e.g. SRECon)

We'll trust you to:

- Code: read, debug, and write production-quality code.
- Design: write code that integrates with components across the entire system, often in collaboration with component teams. This involves assessing workflows, designing appropriate interfaces that provide consistent access to the vital functionality, and then building the applications that can perform many workflows.
- Analyze: SRE is concerned with the behavior of our system. We are often asked to consider the impact of potential changes prior to production, or to analyze why the system is not behaving as expected.

You'll need to have:

- 4+ years working with an object-oriented programming language (C/C++, Python, Java, etc.)
- A degree in Computer Science, Engineering, Mathematics, or a similar field of study, or equivalent work experience
- An understanding of Computer Science fundamentals such as data structures and algorithms
- Prior contributions to system design and architecture and to scaling fault-tolerant, distributed systems
- An honest approach to problem-solving, and the ability to collaborate with peers, partners, and management

We'd love to see:

- Comfort with data analysis and with quantifying the decision-making process
- Monitoring: assessing system health and performance, understanding SLIs, SLOs, and alerting mechanisms
- Distributed systems: heterogeneity, fault tolerance, network and node failure, local inconsistencies (delays in convergence of shared state)
- Cluster management: clusters, deployments, staging, configuration management, A/B testing
- Workflow automation through orchestration
- Operating systems: processes, threads, and scheduling, file systems, memory management, performance tuning; knowledge of Linux or another POSIX-based system is especially useful

Salary: 160,000 - 240,000 USD, annual

Bloomberg is an equal opportunity employer and we value diversity at our company. We do not discriminate on the basis of age, ancestry, color, gender identity or expression, genetic predisposition or carrier status, marital status, national or ethnic origin, race, religion or belief, sex, sexual orientation, sexual and other reproductive health decisions, parental or caring status, physical or mental disability, pregnancy or parental leave, protected veteran status, status as a victim of domestic violence, or any other classification protected by applicable law.

Bloomberg is a disability inclusive employer. Please let us know if you require any reasonable adjustments to be made for the recruitment process. If you would prefer to discuss this confidentially, please email
Created: 2024-10-25