Sunday, December 29

Geek

Daily News Stuff 29 December 2019

Only Two More Shopping Days Until New Year's Edition

Tech News


New System Notes

Doing API testing on the new system today. Request routing, logins, sessions, cookies, automatic compression, all that good stuff.

The query that I was worried would slow down as the database grows - the standard user timeline - does indeed slow down as the database grows. I built in an engine to take care of that, and today I wrote the necessary query to use that engine.

Since the new query currently cuts the time to build a timeline from 0.03s to 0.00s, I'm adding more data to my test system so I can measure it again.

Oh, and you can search within your timeline. Twitter lets you do that now too, though.

Update: Stack engine vs. timeline engine:

500 timeline requests in 69.248s, avg request 139.1ms
500 stack requests in 3.963s, avg request 7.9ms


With a small database the timeline query was running fine. But if the system had taken off it would have been Fail Whale Squared. (I think this type of query caused about 90% of Twitter's problems in the early days.)

Stack requests automatically remember the last N items in your timeline so they don't have to mess around finding them again.
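As a rough illustration of the "remember the last N items" idea, here is a minimal in-memory sketch. The class name `TimelineStack`, its methods, and the default of 1000 items are all assumptions for illustration; the real system keeps its stacks in the database, and nothing here is taken from its code.

```python
from collections import deque
from itertools import islice


class TimelineStack:
    """Hypothetical per-user stack holding only the newest post IDs."""

    def __init__(self, max_items=1000):
        # A bounded deque silently drops the oldest entry once it is full.
        self.items = deque(maxlen=max_items)

    def push(self, post_id):
        """Record a new post at the top of the stack."""
        self.items.appendleft(post_id)

    def top(self, count=20):
        """Return the newest `count` post IDs with no searching or sorting."""
        return list(islice(self.items, count))
```

The work moves from read time to write time: each new post is pushed onto the stack of everyone who should see it, so reading a timeline is just a slice off the top.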

The other major mode is the channel request, which is used for blogs and forums and things like that. Those have no problems:

500 channel requests in 3.042s, avg request 6.1ms

That's the API request time, by the way, not just the database request, though for the timeline the database request is the overwhelming majority of the time.
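For anyone wanting to take the same kind of measurement, a timing loop along these lines would do. This is a sketch only: `benchmark` and `format_result` are hypothetical helpers, the harness actually used for the numbers above isn't shown, and it may work out its average differently.

```python
import time


def benchmark(request_fn, count=500):
    """Call request_fn `count` times and return the total elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        request_fn()
    return time.perf_counter() - start


def format_result(label, count, total_seconds):
    """Format a result line, assuming the average is simply total / count."""
    avg_ms = total_seconds / count * 1000
    return f"{count} {label} requests in {total_seconds:.3f}s, avg request {avg_ms:.1f}ms"
```

Here `request_fn` would be whatever makes one full API call, so the figure covers the whole request and not just the database query.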

I knew about this before but hadn't done the optimisation, because having a standard query let me enforce privacy checks in a single central location. Now I have three versions of the query and have to make sure the privacy checks are applied to each one.
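One way to keep three query variants from drifting apart is to define the privacy rule once and splice the same fragment into each query. The sketch below assumes a made-up schema and a deliberately simple rule (a post is visible if it is public or the viewer wrote it); the real rules and queries are not shown in this post and are certainly more involved.

```python
import sqlite3

# Hypothetical rule: a post is visible if it is public or the viewer wrote it.
PRIVACY_CLAUSE = "(p.is_private = 0 OR p.author_id = :viewer)"

# Every variant embeds the same clause instead of restating the rule.
# The timeline and channel queries are heavily simplified stand-ins.
QUERIES = {
    "timeline": f"SELECT p.id FROM posts p WHERE {PRIVACY_CLAUSE} ORDER BY p.id DESC",
    "stack": (
        "SELECT p.id FROM stack s JOIN posts p ON p.id = s.post_id "
        f"WHERE s.user_id = :viewer AND {PRIVACY_CLAUSE} ORDER BY p.id DESC"
    ),
    "channel": (
        f"SELECT p.id FROM posts p WHERE p.channel_id = 1 AND {PRIVACY_CLAUSE} "
        "ORDER BY p.id DESC"
    ),
}


def run(db, name, viewer):
    """Run one query variant for the given viewer and return the post IDs."""
    return [row[0] for row in db.execute(QUERIES[name], {"viewer": viewer})]


db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INT, channel_id INT, is_private INT);
    CREATE TABLE stack (user_id INT, post_id INT);
    INSERT INTO posts VALUES (1, 10, 1, 0), (2, 20, 1, 1), (3, 10, 1, 1);
    INSERT INTO stack VALUES (10, 1), (10, 2), (10, 3);
""")
```

With this data, user 10 never sees post 2 (private, written by user 20) through any of the three paths, even though it sits in their stack. A test that runs every variant against the same fixture is a cheap way to catch one of them losing the check.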

Now I'm wondering if I can fix up that timeline query to make it run faster, because that could be really useful...

Update: Hah! Yes, that works. If I need to rebuild a user's stack, I can find the IDs of the last thousand posts that should appear in their timeline and shove them into their stack in 60 milliseconds flat. Then database queries within the stack take about 4ms.

The idea there is that if you don't log in for a while the system will stop updating your stack to save resources, but when you do log in I want it brought up to date quickly enough that you don't really notice it. 60ms is fine.
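The stop-updating-then-rebuild-on-login behaviour could look something like the sketch below. Everything in it is assumed for illustration: the one-week idle cutoff, the `UserStack` class, and `find_recent_post_ids`, which is a placeholder standing in for the 60ms rebuild query.

```python
import time

STALE_AFTER = 7 * 24 * 3600  # hypothetical cutoff: one week without a login
STACK_SIZE = 1000            # "the last thousand posts"


class UserStack:
    """Hypothetical record of one user's stack and whether it is maintained."""

    def __init__(self):
        self.post_ids = []
        self.last_login = 0.0
        self.active = False


def find_recent_post_ids(user_id, limit):
    """Placeholder for the rebuild query; returns fake IDs, newest first."""
    return list(range(limit, 0, -1))


def expire_idle(stack, now):
    """Stop maintaining a stack whose owner has been away too long."""
    if stack.active and now - stack.last_login > STALE_AFTER:
        stack.active = False


def on_login(user_id, stack, now=None):
    """Rebuild the stack if it went stale, then return the top 20 post IDs."""
    now = time.time() if now is None else now
    if not stack.active:
        stack.post_ids = find_recent_post_ids(user_id, STACK_SIZE)
        stack.active = True
    stack.last_login = now
    return stack.post_ids[:20]
```

The rebuild only runs on the first login after a stack has gone stale, so regular users never pay for it.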

The main message query has five joins and ten subselects, which is great when the optimisation is just right because it gets everything the API needs in one go. When the optimisation is not just right, though, things go south in a hurry.

The stack works great because it means the main query never has to sort - to get the top 20 messages it just reads the top 20 stack records in index order and does five one-to-one joins.
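To make the "reads the top 20 in index order" point concrete, here is a small SQLite sketch. The tables, columns, and the two joins are assumptions (the real query has five joins and runs on a different schema); the point is only that the stack's key is already in the order the query wants.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INT, body TEXT);
    -- The primary key doubles as the index that the read walks.
    CREATE TABLE stack (
        user_id INT, post_id INT,
        PRIMARY KEY (user_id, post_id)
    ) WITHOUT ROWID;
""")
db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
db.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(i, 1 + i % 2, f"post {i}") for i in range(1, 101)],
)
# Hypothetical user 7 has all 100 posts on their stack.
db.executemany("INSERT INTO stack VALUES (7, ?)", [(i,) for i in range(1, 101)])

TOP_N = """
    SELECT p.id, u.name, p.body
    FROM stack s
    JOIN posts p ON p.id = s.post_id      -- one-to-one lookup by primary key
    JOIN users u ON u.id = p.author_id    -- one-to-one lookup by primary key
    WHERE s.user_id = ?
    ORDER BY s.post_id DESC
    LIMIT ?
"""
rows = db.execute(TOP_N, (7, 20)).fetchall()
```

Because the `ORDER BY` matches the key on `stack`, the database can walk that key backwards and stop after 20 rows, with each join being a single primary-key lookup. Running `EXPLAIN QUERY PLAN` on the query is a quick way to check that no separate sort step has crept in.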


Disclaimer: I tried to recite "How Doth the Little Busy Bee" but it came out all different.

Posted by: Pixy Misa at 10:47 PM