Finn, what the cabbage?

Friday, May 22

Geek

A Few Of My Favourite Things

Once I got past the segfaults, anyway.  You should be using these things.

MongoDB 3.0 (not earlier versions, though)
Elasticsearch (though their approach to security is remarkably ass-backwards)
LZ4
MsgPack
Bunch
Bleach
Passlib

[Image: Ai Shinozaki (/images/AiPurple.jpg)]

Posted by: Pixy Misa at 05:10 PM | Comments (10) | Add Comment | Trackbacks (Suck)
Post contains 57 words, total size 2 kb.

Wednesday, May 13

Geek

Some Pig

So, I'm tinkering with what will become Minx 1.2, and testing various stuff, and I'm pretty happy with the performance.

Then I run the numbers, and realise that I'm flooding a 10GbE connection with HTTP requests using a $15 cloud server.

I think we can count that part of the problem space as solved.

[Image: /images/AiHeadband.jpg]

Posted by: Pixy Misa at 05:30 PM | Comments (4) | Add Comment | Trackbacks (Suck)
Post contains 56 words, total size 1 kb.

Geek

Hard Things

There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.

Posted by: Pixy Misa at 11:27 AM | Comments (2) | Add Comment | Trackbacks (Suck)
Post contains 18 words, total size 1 kb.

Tuesday, May 12

Geek

That'll Do

I was getting about 1000 random record reads per second.
I needed to achieve 10,000 reads per second to make things work.
I wanted to reach 100,000 reads per second to make things run nicely.
I'm currently at 1,000,000.*

That'll do.

[Image: /images/AiGreeenStripes.jpg]

* Best test run so far was ~1.6 million records per second, with some special-case optimisations.**  Without optimisations, around 300k. Per thread.

** Since you asked, the problem was with unpacking large cached records into native objects.  A common case in templates is that you only want to access one or two fields in a record - perhaps just the user's name - but unless the record is already a native object you need to load the external representation and parse it to find the field you need.  The solution was to keep an immutable version of the object in the process, sign it with SHA-256, and sign the matching cache entry.  Then, when we need to access the record, we can read the binary data from the cache, compare the signatures, and if they match, we're safe to continue using the existing native structure.  If they don't match, we take the payload, decrypt it (if encryption is enabled), check that the decrypted payload matches the signature (if not, something is badly wrong), uncompress the payload (if compression is enabled), parse it (MsgPack or JSON), instantiate a new object, freeze it, and put it back into the native object cache.  This can take as long as 20 microseconds.
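
In sketch form - assuming a shared byte cache and MsgPack payloads; get_record, cache_get, and the _native dict are illustrative names, and the decrypt/uncompress/freeze steps are elided:

    import hashlib
    import msgpack

    _native = {}  # per-process: key -> (digest, parsed object)

    def get_record(key, cache_get):
        payload = cache_get(key)        # raw bytes from the shared cache
        if payload is None:
            return None                 # miss: fall back to the database
        digest = hashlib.sha256(payload).digest()
        hit = _native.get(key)
        if hit is not None and hit[0] == digest:
            return hit[1]               # signatures match: reuse the object
        obj = msgpack.unpackb(payload)  # the ~20us slow path
        _native[key] = (digest, obj)
        return obj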

Posted by: Pixy Misa at 05:53 PM | Comments (1) | Add Comment | Trackbacks (Suck)
Post contains 253 words, total size 2 kb.

Sunday, April 26

Geek

Needs For Speeds

Testing various libraries and patterns on Python 2.7.9 and PyPy 2.5.1.  Lower is better for the individual test times; the PixyMarks row is a combined score, where higher is better.

Test          Python    PyPy     Gain
Loop           0.27     0.017    1488%
Strlist        0.217    0.056     288%
Scan           0.293    0.003    9667%
Lambda         0.093    0.002    4550%
Pystache       0.213    0.047     353%
Markdown       0.05     0.082     -39%
ToJSON         0.03     0.028       7%
FromJSON       0.047    0.028      68%
ToMsgPack      0.023    0.012      92%
FromMsgPack    0.02     0.013      54%
ToSnappy       0.027    0.032     -16%
FromSnappy     0.027    0.024      13%
ToBunch        0.18     0.016    1025%
FromBunch      0.187    0.016    1069%
CacheSet       0.067    0.046      46%
CacheGet       0.037    0.069     -46%
CacheMiss      0.017    0.015      13%
CacheFast      0.09     0.067      34%
CachePack      0.527    0.162     225%
PixyMarks     13.16    40.60      209%

Notes
  • The benchmark script runs all the tests once to warm things up, then runs them three times and takes the mean.  The warm-up matters for PyPy, because it takes some time for the JIT compiler to engage.  The PixyMark score is simply the inverse of the geometric mean of the individual test times (there's a sketch of the calculation after these notes).

    Tests were run on a virtual machine on what I believe to be a Xeon E3 1230, though it might be a 1225 v2 or v3.

  • The Python Markdown library is very slow. The best alternative appears to be Hoep, which is a wrapper for the Hoedown library, which is a fork of the Sundown library, which is a fork of the unfortunately named Upskirt library.   (The author of which is not a native English speaker, and probably had not previously run into the SJW crowd.)

    Hoep is slower for some reason in PyPy than CPython, but still plenty fast.

  • cPickle is an order of magnitude slower than a good JSON or MsgPack codec.

  • The built-in JSON module in CPython is the slowest Python JSON codec. The built-in JSON module in PyPy appears to be the fastest.  For CPython I used uJSON, which seems to be the best option if you're not using PyPy.

  • CPython is very good at appending to strings. PyPy, IronPython (Python for .Net) and Jython (Python for Java) are uniformly terrible at this. This is due to a clever memory allocation optimisation that is tied closely to CPython's garbage collection mechanism, and isn't available in the other implementations.

    I removed the test from my benchmark because for large strings it's so slow that it overwhelms everything else.  Instead, append to a list and join it when you're done, or something along those lines - there's a sketch after these notes.

  • I generally see about a 6x speedup from PyPy.  In these benchmarks I've been focusing on getting the best possible speed for various functions, using C libraries wherever possible.  A C library called from Python runs at exactly the same speed as a C library called from PyPy, so this has inherently reduced the relative benefits of PyPy.  PyPy is still about 3x faster, though; in other words, migrating to PyPy effectively turns a five-year-old mid-range CPU into 8GHz next-gen unobtainium.  

  • That's if you are very careful about selecting your libraries, though.  There's an alternate Snappy compression library available; it's about the same speed under CPython, but 30x slower under PyPy due to inefficiencies in PyPy's ctypes binding.

  • uWSGI is pretty neat.  The cache tests are run using uWSGI's cache2 module; it's the fastest caching mechanism I've seen for Python so far.  Faster than the native caching decorators I've tested - and it's shared across multiple processes.  (It can also be shared across multiple servers, but that is certain to be slower, unless you have some seriously fancy networking hardware.)

    One note, though: the uWSGI cache2 Python API is not binary-safe.  You need to JSON-encode or Base64-encode your values, or something along those lines - see the sketch after these notes.

  • The Bleach package - a handy HTML sanitiser - is so slow that it's useless for web output - you have to sanitise on input, which means that you either lose the original text or have to store both.  Unless, that is, you have a caching mechanism with a sub-microsecond latency.  (There's a quick example after these notes.)

  • The Bunch package, on the other hand - which lets you use object notation on Python dictionaries, so you can say customer.address rather than customer['address'] - is really fast.  I've been using it a lot recently and knew it was fast, but 1.6us to wrap a 30-element dictionary under PyPy is a pretty solid result.  (Example after these notes.)

  • As an aside, if you can retrieve, uncompress, unpack, and wrap a record with 30 fields in 8us, it's worth thinking about caching database records.  Except then you have to worry about cache invalidation.  Except - if you're using MongoDB, you can tail the oplog to automatically invalidate cached records.  And if you're using uWSGI, you can trivially fork that off as a worker process (sketched after these notes).

    Which means that if you have, say, a blogging platform with a template engine that frequently needs to look up related records (like the author or category for a post) this becomes easy, fast, and almost perfectly consistent.  
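
A few sketches, as promised.  The PixyMark calculation itself is just this (a minimal sketch; the real script also does the warm-up pass and the three timed runs):

    import math

    def pixymark(times):
        # Inverse of the geometric mean of the per-test times:
        # halving every test time doubles the score.
        log_mean = sum(math.log(t) for t in times) / len(times)
        return 1.0 / math.exp(log_mean)

    print(pixymark([0.27, 0.217, 0.293]))  # toy input, not the full suite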
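
The string-building pattern - collect the pieces in a list and join once at the end, rather than appending to a string:

    def render(items):
        # List appends are cheap everywhere; repeated string concatenation
        # degrades badly on PyPy, IronPython, and Jython.
        parts = []
        for item in items:
            parts.append('<li>%s</li>' % item)
        return '<ul>%s</ul>' % ''.join(parts)

    print(render(['one', 'two', 'three']))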
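
Working around the cache2 binary-safety issue looks something like this (the helper names are mine, and the uwsgi module is only importable inside a uWSGI process):

    import base64
    import uwsgi  # provided by the uWSGI runtime, not installable via pip

    def cache_set_bytes(key, value):
        # Base64-armour arbitrary bytes before they hit cache2.
        uwsgi.cache_update(key, base64.b64encode(value))

    def cache_get_bytes(key):
        value = uwsgi.cache_get(key)
        return base64.b64decode(value) if value is not None else None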
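
Sanitising on input with Bleach is at least simple, whatever its speed:

    import bleach

    dirty = '<p onclick="alert(1)">Hi<script>nope()</script></p>'
    # Whitelist a few harmless tags and strip everything else.
    print(bleach.clean(dirty, tags=['p', 'a', 'em', 'strong'], strip=True))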
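
Bunch in action - bunchify recursively wraps nested dictionaries:

    from bunch import bunchify

    record = bunchify({'name': 'Ai', 'address': {'city': 'Tokyo'}})
    print(record.address.city)  # rather than record['address']['city']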
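
And the oplog-tailing invalidator is only a few lines with pymongo 3 (cache_delete and the key scheme here are illustrative, not Minx's actual code):

    import pymongo

    client = pymongo.MongoClient()
    oplog = client.local.oplog.rs  # the oplog only exists on replica sets

    def invalidate_forever(cache_delete):
        # Tail updates and deletes forever, dropping matching cache entries.
        cursor = oplog.find({'op': {'$in': ['u', 'd']}},
                            cursor_type=pymongo.CursorType.TAILABLE_AWAIT)
        for entry in cursor:
            doc_id = entry.get('o2', entry['o']).get('_id')
            cache_delete('%s:%s' % (entry['ns'], doc_id))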

Posted by: Pixy Misa at 01:28 PM | No Comments | Add Comment | Trackbacks (Suck)
Post contains 1403 words, total size 15 kb.

Thursday, April 23

Geek

And Now For Something Educational

Ish.

Posted by: Pixy Misa at 10:39 AM | Comments (4) | Add Comment | Trackbacks (Suck)
Post contains 6 words, total size 1 kb.

Saturday, March 28

Geek

Azunyan Update

So, Cities: Skylines at 4K and max settings on a low-end notebook GPU is not a good experience.  Turn down the settings and set the resolution to 1920x1080, though, and it's not too bad.  Not silky smooth, but playable.

The new version of IntelliJ IDEA, my IDE of choice, just came out, and features HiDPI support.  Tried it on my CrazyHiDPI notebook...  It Just Works.™

It's a lot heavier than Chika, but then Chika weighs all of 2lbs, so a 15" notebook with touchscreen and dedicated GPU would be expected to weigh a little more.

Screen is glossy, which is great for colour but not great for working in a brightly lit environment. 

So far, it works, it's fast, it has 4x the RAM and 2.5x the free disk space of Chika, and I'm very happy.

Posted by: Pixy Misa at 02:27 PM | No Comments | Add Comment | Trackbacks (Suck)
Post contains 138 words, total size 1 kb.

Monday, March 23

Geek

Azunyan Est Arrivé

I checked this morning and it hadn't shipped yet.

In fact, I checked just now and it hasn't shipped yet.

But it's sitting on my desk right now.  Apparently Star Track Express use a tachyon drive.

Posted by: Pixy Misa at 03:10 PM | Comments (3) | Add Comment | Trackbacks (Suck)
Post contains 39 words, total size 1 kb.

Geek

Six Years Counts For Something

So Azunyan will be showing up tomorrow with any luck, and I was wondering just how her GPU will hold up.  It's a Radeon R7 M270, which is a slightly tweaked R7 M265, and not a very fast chip.  It's roughly in line with AMD's better integrated graphics, or Intel's very best, but with the benefit of having its own 4GB of dedicated RAM, and drivers that work.

But according to this comparison, it delivers 80% of the 3D performance of my recently retired Radeon 4850.  And 80% of the 2D performance of my current Radeon 7950, which has no trouble at all with my 4K desktop monitor.

That's better than I expected.  I played Mass Effect 1 and 2 and Dragon Age (the original, accept no other) on that 4850.  Not at the full 2560x1440 resolution of my monitor of the time, but with full graphics details otherwise.

So at 1280x720 up to maybe 1920x1080, depending on the game, it should be adequate.  I'm going to give Cities: Skylines a try at 4K just to see what it looks like; it's not a graphically taxing game but I still expect it to over-extend the M270.

Posted by: Pixy Misa at 12:38 AM | No Comments | Add Comment | Trackbacks (Suck)
Post contains 201 words, total size 1 kb.

Wednesday, March 18

Geek

Time For A Change

So Mio is nearly five years old now.  Still working just fine, but five years is a long time in notebookland, and I'm looking at getting one of these.

[Image: /images/Azunyan7000.jpg]

3GHz 5th generation Core i7 (up from a 2.13GHz 1st generation Core i3), a Radeon R7 M270 (up from a Radeon 5650), 16GB RAM (from 4GB), 4GB video RAM (from 1GB), and a 256GB SSD (up? from a 500GB HDD).

And a 3840x2160 IPS touchscreen up from a 1920x1080 TFT nontouchscreen.  Touch I don't care about so much, but 4K IPS on a 15" notebook is very nice.

Only obvious problem is that it lacks dedicated page up/down, home, and end keys.  For editing code, that's a pain.

Update: Ordered.  Azunyan inbound, ETA Tuesday next.

Posted by: Pixy Misa at 05:28 PM | Comments (6) | Add Comment | Trackbacks (Suck)
Post contains 128 words, total size 1 kb.
