
Scaling Elasticsearch to Hundreds of Developers

Yelp uses Elasticsearch to rapidly prototype and launch new search applications, and moving quickly at our scale raises challenges. In particular, we often find it difficult to change query logic without impacting users, and we run into client library bugs, multi-tenancy problems, and general reliability issues. As the number of engineers at Yelp writing new Elasticsearch queries grew, our Search Infrastructure team struggled to support the multitude of ways engineers were finding to send queries to our Elasticsearch clusters. The infrastructure we designed for a single team communicating with a single cluster did not scale to tens of teams and tens of clusters.

Problems we Faced with Elasticsearch at Yelp

Elasticsearch is a fantastic distributed search engine, but it is also a relatively young datastore with an immature ecosystem. Until September 2013, there was no official Python client. Elasticsearch 1.0 only came out in February of 2014. Meanwhile, Yelp has been scaling search using Elasticsearch since September of 2012 and as early adopters, we have hit bumps along the way.
We have hundreds of developers working on tens of services that talk to tens of clusters. Different services use different client libraries and different clusters run different versions of Elasticsearch. Historically, this looked something like:
Figure 1: Yelp Elasticsearch Infrastructure

What are the Problems?

  • Developers use many different client libraries, and we have to support all of them.
  • We run multiple versions of Elasticsearch, mostly 0.90.1, 1.0.1, and 1.2.1 clusters.
  • Multi-tenancy is often not acceptable for business critical clients because Elasticsearch cannot offer machine level resource controls and its JVM level isolation is still in development.
  • Having client code spread over a multitude of services and applications makes auditing and changing client code hard.
These problems all derive from Elasticsearch’s inevitably wide interface. Elasticsearch’s developers have explicitly chosen a wide interface that is hard to defend due to its lack of access controls, which makes sense given the complexity they are trying to express over HTTP. However, it means that treating Elasticsearch as just another service in a Service Oriented Architecture rapidly becomes hard to maintain. The Elasticsearch API is continually evolving, sometimes in backwards-incompatible ways, and the client libraries built on top of that API are continually changing as well, which ultimately means that iteration speeds suffer.

Change is Hard

As we scaled usage of Elasticsearch here at Yelp, it became harder and harder to change existing code. To illustrate these concerns let us consider two examples of developer requests on the infrastructure mentioned in Figure 1:

Convert Main Web App to use the RequestsES Client Library

This involves finding all the query code in our main web app and then, for each query:
1. Create secondary paths that use RequestsES
2. Set up RequestBucketer groups.
3. Write duplicate tests.
4. Deploy the change.
5. Remove duplicate tests.
We can make the code changes fairly easily, but deploying our main web app takes a few hours and we have a lot of query code that needs to be ported, so the work would take significant developer time. The high developer cost of changing this code outweighs the infrastructure benefits, which means the change is never pursued.

Convert Service 4 to elasticsearch-py and move them to Cluster 4

Service 4’s SLA has become stricter, and they can no longer tolerate the downtime caused by Service 1’s occasionally expensive facet queries. Service 4’s developers also want the awesome reliability features that Elasticsearch 1.0 brought, such as snapshot and restore. Unfortunately, our version of the YelpES client library does not support 1.X clusters, but the official Python client does, which is OK because engineers in Search Infrastructure are experts at porting YelpES code to the official Python client. Alas, we do not know anything about Service 4. This means we have to work with the team that owns Service 4, have them build parallel paths, and tell them how to communicate with the new cluster. This takes significant developer time because of the coordination overhead between our teams.
It is easy to see that as the number of developers grows, these development patterns just do not scale. Developers are continually adding new query code in various services, using various client libraries, in various programming languages. Furthermore, developers are afraid to change existing code because of long deployment times and business risk. Infrastructure and operations engineers must maintain multi-tenant clusters housing clients with completely different uptime requirements and usage patterns.
Everything about this is bad. It is bad for developers, infrastructure engineers, and operations engineers, and it leads to the following lesson learned:
Systems that use Elasticsearch are more maintainable when query code is separated from business logic

Our Solution

Search Infrastructure at Yelp has been employing a proxy service we call Apollo to separate the concerns of the developers from the implementation details so that now our infrastructure looks like this:
Figure 2: Apollo

Key Design Decisions

Isolate infrastructure complexity

The first and foremost purpose of Apollo is to isolate the complexity of our search infrastructure from developers. If a developer wants to search reviews from their service, they POST a JSON blob:

{"query_text": "chicken tikka masala", "business_ids": [1, 2, 3] }

to an Apollo url:

apollo-host:1234/review/v3/search

The developer need never know that this is doing an Elasticsearch query using the elasticsearch-py client library, against an Elasticsearch cluster running in our datacenter that happens to run Elasticsearch version 1.0.1.
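For illustration, a consuming service could issue that request with any HTTP client. The sketch below assumes the standard requests library and a made-up response shape; it is not Apollo's actual client binding.

import requests

# POST the JSON blob to the versioned Apollo endpoint shown above.
response = requests.post(
    'http://apollo-host:1234/review/v3/search',
    json={'query_text': 'chicken tikka masala', 'business_ids': [1, 2, 3]},
    timeout=1.0,
)
response.raise_for_status()
results = response.json()  # shape is governed by the client's output schema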
Validation of all incoming and outgoing JSON objects using json-schema ensures that interfaces are respected, and because these schemas ship with our client libraries, we are able to check interfaces in calling code, even when that calling code is not written in Python.
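As a rough sketch of that validation step, the snippet below checks the review search blob against a guessed-at input schema using the jsonschema library; the field definitions are assumptions, not Apollo's real schema.

import jsonschema

# Hypothetical input schema for /review/v3/search.
REVIEW_SEARCH_INPUT_SCHEMA = {
    'type': 'object',
    'properties': {
        'query_text': {'type': 'string'},
        'business_ids': {'type': 'array', 'items': {'type': 'integer'}},
    },
    'required': ['query_text'],
    'additionalProperties': False,
}

# Raises jsonschema.ValidationError if the request violates the interface.
jsonschema.validate(
    {'query_text': 'chicken tikka masala', 'business_ids': [1, 2, 3]},
    REVIEW_SEARCH_INPUT_SCHEMA,
)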

Make it easy to iterate on query code

Every query client is isolated in its own client module within Apollo, and each client is required to provide an input and output schema that governs what types of objects the client should accept and return. Each such interface is bound to a single implementation of a query client, which means that in order to make a non-backwards-compatible interface change, one must write an entirely new client that binds to a new version of the interface. For example, if the interface to review search changes, developers write a separate module and bind it to /review/v4/search, while the old module remains bound to /review/v3/search. No more “if else” experiments; just self-contained modules that each focus on doing one thing well.
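To make the module-per-interface idea concrete, here is a minimal sketch of what a v4 review search client could look like; the class name, attributes, and query shape are illustrative assumptions rather than Apollo's real internals.

# Hypothetical v4 client module; the v3 module stays bound to /review/v3/search untouched.
class ReviewSearchClientV4(object):

    route = '/review/v4/search'        # assumed binding convention
    input_schema = {'type': 'object'}  # in practice, a full schema like the earlier sketch

    def build_query(self, request):
        # Translate a validated request dict into an Elasticsearch 1.x query body.
        match = {'match': {'review_text': request['query_text']}}
        if request.get('business_ids'):
            return {
                'query': {
                    'filtered': {
                        'query': match,
                        'filter': {'terms': {'business_id': request['business_ids']}},
                    }
                }
            }
        return {'query': match}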
A key benefit of per-module versioning is that developers can iterate on their query clients independently, and because the Apollo service is continuously delivered, new query code hits production in tens of minutes. Each client can also be selectively turned off or redirected to another cluster if it causes problems in production.
As for language, we chose Python due to Yelp’s mature Python infrastructure and the ease with which consumers can define both simple and complicated query clients. For a high-throughput service like Apollo, Python (or at least Python 2) is usually the wrong choice because of high resource usage and poor concurrency support, but by using the excellent gevent library for concurrency and the highly optimized JSON parsing library ujson, we were able to scale Apollo to extremely high query loads. In addition, these libraries are drop-in replacements, so clients do not have to design concurrency into their query logic; it comes for free. At peak load, Apollo with gevent can execute thousands, if not tens of thousands, of concurrent Elasticsearch queries on a single uwsgi worker process, which is pretty good compared to the single concurrent query that a normal Python uwsgi worker can achieve.
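The snippet below is a rough illustration of that drop-in approach rather than Apollo code: gevent's monkey patching makes ordinary blocking HTTP calls cooperative, and ujson handles the JSON encoding; the Elasticsearch host and index are made up.

from gevent import monkey
monkey.patch_all()  # make blocking sockets cooperative before other imports

import gevent
import requests
import ujson

def search_reviews(es_query):
    # An ordinary blocking HTTP call; with the monkey patch in place, gevent
    # switches to other greenlets while this one waits on Elasticsearch.
    resp = requests.post(
        'http://es-host:9200/review/_search',  # hypothetical cluster address
        data=ujson.dumps(es_query),
    )
    return ujson.loads(resp.text)

# Fan out many queries concurrently on a single worker process.
queries = [{'query': {'match': {'review_text': term}}} for term in ('pizza', 'ramen', 'tacos')]
greenlets = [gevent.spawn(search_reviews, q) for q in queries]
gevent.joinall(greenlets)
results = [greenlet.value for greenlet in greenlets]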

Make it easy to iterate on infrastructure

Because the only thing that lives in Apollo is code that creates Elasticsearch queries, it is easy to port clients to new libraries or move a client to a different cluster in a matter of minutes. The interface stays the same, and end-to-end tests ensure functionality is not broken.
Another key capability is that from the start we designed these modules to implement a simple interface that is composable. This composable-first architecture has allowed us to provide wrappers like:
  • SlowQueryLogger: A unary wrapper that logs any slow requests to a log for auditing and monitoring.
  • Tee: A binary wrapper that allows us to make requests to two clients but only wait on results from one of them. This is useful for dark launching new clients or load testing new clusters.
  • Mux: An n-ary wrapper that directs traffic between many clients. This is useful for gradual rollouts of new query code or infrastructure.
As an example, let us assume there are two query clients which differ only in the client library they use and Elasticsearch version they expect: ReviewSearchClient and OfficialReviewSearchClient. Furthermore, let us say our operations engineer has just provisioned a new shiny cluster running Elasticsearch 1.2.1 that lives in the cloud and is ready to be load tested. An example composition of these clients within Apollo might be:
# Create base clients
PyESClient = ReviewSearchClient(query_timeout_s=0.300)
OfficialClient = OfficialReviewSearchClient(query_timeout_s=0.300)
OfficialClientCloud = OfficialReviewSearchClient(
    service_name='elasticsearch-cloud',
    query_timeout_s=0.300,
)

# Compose base clients together
ComposedReviewSearchClient = SlowQueryLogger(
    log_threshold_ms=500,
    client=Tee(
        toggle='toggles.reviews.enable_dark_launch',
        client=Mux(
            fallback=ClientWithToggle(
                client=PyESClient,
                toggle='toggles.reviews.use_pyes_client',  # set to 0.50
            ),
            client_configs=[
                ClientWithToggle(
                    client=OfficialClient,
                    toggle='toggles.reviews.use_official_client',  # set to 0.50
                ),
            ],
        ),
        tee_client=OfficialClientCloud,
    ),
)

This maps to the following request path:
Figure 3: Life of a Request in Apollo
In this short amount of Python code we achieved the following:
1. If any query takes longer than 500ms, log it to our slow query log for inspection
2. Send all traffic to a Mux that muxes between an old PyES implementation and our new official client implementation. We can change the Mux weights at runtime without a code push
3. Separately send traffic to a cloud cluster that we want to load test. Do not wait for the result.
Most important of all, we never have to worry about which consumers are making review search requests, because there is a well-defined, well-tested interface. Additionally, because Apollo uses Yelp’s mature Python service stack, we get performance and quality metrics that can be monitored for each client, which means we do not have to be afraid of making these kinds of changes.

Revisiting developer requests

Now that Apollo exists, making changes goes from weeks to days, which means our organization can continue to be agile in the face of changing developer needs and backwards incompatible Elasticsearch versions. Let us revisit those developer requests now that we have Apollo:

Convert Main Web App to use the RequestsES Client Library

We have to find all the clients in Apollo that the Main Web App queries and implement their interfaces using the RequestsES client library. Then we wire up a Mux for each client that allows us to switch between the two implementations of the interface, deploy our code (~10 minutes), and gradually roll out the new code through configuration changes. From experience, query code like this can be ported in an afternoon. Having minutes-long deploys to production makes all the difference, because it means you can push to production multiple times in one day instead of once a week. Also, because the Elasticsearch query-crafting code is separated from all the other business logic, it is easier to reason about and to change with confidence.

Convert Service 4 to elasticsearch-py and move them to Cluster 4

We can implement Service 4’s interface using the new client library, re-using existing tests to ensure functional equivalence between the two implementations. Then we set up a Tee to the new cluster to make sure our new code works and the cluster can handle Service 4’s load. Finally, we wait a few days to ensure everything works, and then we change the query client to point at the new cluster. If we really want to be safe, we can set up a Mux and gradually roll it over. The whole process takes a few days or less of developer time.

Infrastructure Win

Now that Yelp engineers can leverage Apollo, along with our real time indexing system and dynamic Elasticsearch cluster provisioning, they can develop search applications faster than ever. Whereas before Search Infrastructure was accustomed to telling engineers “unfortunately we can’t do that yet”, today we have the flexibility to support even the most ambitious projects.
Since the release of Apollo just a few months ago, we have ported every major Yelp search engine running on Elasticsearch to use Apollo and enabled dozens of new features to be developed by other teams. Furthermore, thanks to Apollo we were able to seamlessly upgrade a number of our clients to Elasticsearch 1.X, something that would previously have been nearly impossible given our uptime requirements.
As for performance, we have found that the slight overhead of running this proxy has proved more than worth it in deployment, cluster reconfiguration, and developer iteration time, letting us make up for the per-request overhead by shipping big-win refactors that improve query performance.
At the end of the day Apollo gives us flexibility, fast deploys, new Elasticsearch versions, performant queries, fault tolerance, and isolation of complexity. A small abstraction and the right interface turn out to be a big win.
