Showing posts from 2016

Machine Learning Performance

The need for speed

Coming from a real-time world (games and graphics), current machine learning training is a shock. Even small, simple training tasks take a long time, which gives me plenty of time to think about the performance of my graphs. Currently most of the open source deep learning stacks consist of a C++ and CUDA back-end driven by a Python/R/Lua front end. The CUDA code tends to be fairly generic, which makes sense from an HPC and experimental point of view. It's important to note that HPC code is very optimised, but tends to rely on standard interfaces with relatively generic back-ends. For example, BLAS is an old FORTRAN standard for linear algebra; it has numerous back-ends, including optimised x64, CUDA, OpenCL, etc. However, it only accelerates the tensor multiplies in a few data formats; other more data-specific optimisations, like overlapping conversions, aren't in its remit. It's quite a different optimisation strategy from the real-time world, which tends to be less…
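As a rough sketch of what BLAS actually standardises: the core Level 3 routine, GEMM, computes C = alpha*A*B + beta*C. The naive loops below are illustrative only (not any particular BLAS binding); a real back-end blocks and vectorises them for the target hardware, which is exactly why the interface fixes the operation and a handful of data formats but nothing about the surrounding data movement:

```python
# Illustrative sketch of the operation a BLAS GEMM (e.g. SGEMM) performs:
# C = alpha * A @ B + beta * C, on row-major lists of lists.
def gemm(alpha, A, B, beta, C):
    m, k, n = len(A), len(B), len(B[0])
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += A[i][p] * B[p][j]
            C[i][j] = alpha * acc + beta * C[i][j]
    return C
```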

A large scale archival storage system - ERC Reinforcement

About 5 years ago, I came close to starting a tech company selling archival level storage systems. Part of the idea was an algorithm and technique I developed that stores large amounts of data reliably and efficiently.

I wrote up a patent, and even pushed it through the first phase of the UK patent system, but then decided not to go ahead. Effectively the not-patent is now open for anyone.

It's designed for large scale systems, thousands of storage devices, and isn't necessarily fast. However, for large archival storage it saves a significant number of storage devices over traditional RAID systems. So if data retention is really important, this is an idea that might be worth investigating...

The best example of its utility is that a 1000 storage unit system would have higher resilience, and more disks used for actual storage, than a RAID71 system (the best normal system) of similar size.

Anyway, I found one of the drafts, so thought I'd publish it here (including typos!). I never actually published it beyond the entr…

Flu + teaching AI to play games

Flu and strange Ideas

I have a cold/flu thing at the moment and feel rotten. Due to the interaction with my general health, when I get even a mild cold or flu I get pain everywhere, and given the level of painkillers I take normally, I just have to grin and bear it. The way I tend to cope is to keep my mind occupied, to try not to think about it. Strangely this is often a creative time in terms of random thoughts; I guess the body pushes more natural drugs into me to try and counteract the pain, leaving me a bit 'drugged'.
Thinking about teaching (not training) AIs

Last night I was deep in thought about Deep AIs; as you may have noticed from my recent blog posts, it's something I'm really enjoying, and TBH I can see myself working in the field. Before the fame of the recent AlphaGo wins, DeepMind were tackling other, simpler games on the Atari 2600. The paper "Human-level control through deep reinforcement learning" successfully learnt to play a number of…

The greatest lie ever told: Premature Optimisation is the root of all evil!

It's a lie because it implies there is a 'premature' aspect to writing code: that there is some phase of the research/project in which you shouldn't worry about performance or power.

It's simply not true. There are times when it's not the greatest priority, but thinking about how your code affects these areas is never wasted.

At some fundamental level, software is something that turns power (electricity) into math. Any time you don't do that optimally, you waste power. The only time that doesn't matter is when you have a small problem and lots of power; in practice, few interesting problems are lucky enough to get away with that.

If you work in machine learning or big data, the likely limit on what you can do is how many processors (whether they are CPUs, GPUs or FPGAs) you can throw at the problem. Assuming you can't scale infinitely, the results you can get out of a finite set of HW will largely be determined by how performant your code is.

When you've designed your latest ANN tha…

Teaching machines to render

I've been studying AI tech a fair bit recently, for a variety of reasons. There are lots of areas where I want to explore using AI as solvers/approximations, but as someone who is generally employed to do graphics stuff, there's always an interest in the application of AI technology to rendering. Currently, except for a few papers on copying artistic styles to photos, it's not yet a major discipline.
The big thing to take away from AI like deep neural nets is that they are approximate function solvers, taught by showing them data rather than being explicitly programmed. Once trained, they evaluate their inputs through their learnt 'function' and give you a result. With enough nodes and layers in the network, and a lot of training, they can approximate a solution to any equation you have existing data for.
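A minimal sketch of that 'learnt function' idea: once trained, a one-hidden-layer network is just nested weighted sums pushed through a non-linearity. The weights below are made up, standing in for trained ones; the point is that evaluation is a fixed function of the inputs:

```python
import math

def forward(x, w_hidden, w_out):
    # Hidden layer: weighted sums of the inputs, squashed through tanh.
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    # Output: a weighted sum of the hidden activations.
    return sum(w * h for w, h in zip(w_out, hidden))
```

Training is just the search for the w_hidden and w_out that make this function match the data.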
In rendering, the classic Kajiya equation is what real-time and offline renderers attempt to solve. The reason why rendering takes so much compute power is that direct…
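For reference, Kajiya's rendering equation in its common hemisphere form is:

```latex
L_o(x, \omega_o) = L_e(x, \omega_o)
  + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (\omega_i \cdot n)\, d\omega_i
```

The outgoing radiance at a point is its emission plus the integral, over the hemisphere, of incoming radiance weighted by the surface's BRDF and the incidence angle; the recursion hiding inside L_i is what makes it so expensive to solve.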

Skinning Roger Rabbit AKA Are we there yet?

I was lucky enough to grow up in the era when home computers and consoles were new, and a movie using CG was something worth talking about if anybody actually managed to use it (I'm looking at you, Max Headroom and Tron; both were designed to look like CG but mostly used old fashioned effects, as computers weren't fast enough). One of my favourite movies didn't (AFAIK) use CG, but would now use it a lot: Who Framed Roger Rabbit.

Roger Rabbit is a clever comedy/whodunit about a murder, with the main suspect being the eponymous Roger Rabbit. The twist is that Roger Rabbit is a rabbit, a cartoon rabbit. The world is set up so that Hollywood has a section called Toon Town, where cartoons actually exist: Bugs Bunny and Mickey Mouse really exist and act in TV shows and movies just like other actors. The star (apart from Roger) is Eddie Valiant, played by Bob Hoskins, a classic '20s/'30s grizzled private eye who hates Toons after one killed his partner.
The part that stands o…

Multi-frequency Shading and VR.

Our peripheral vision is low resolution but high temporal frequency; our focal vision is high resolution but updates at a lower frequency. This is one of the reasons 60Hz isn't good enough for VR: at the edges, most people can still consciously see flicker.

Additionally, of course, VR is stereoscopic, so it requires two views of everything and a low latency response. Whilst you can just run everything at 90Hz or even 144Hz, that is expensive both in performance and in power.

Multi-frequency shading tackles this by sharing, where possible, some of the calculation over time and space (viewports). Of course, for this to work, we need to break the rendering down into parts that are the same (or close enough) to be run shared.

Perhaps the oldest split is diffuse versus specular. Diffuse lighting depends only on the light position and the surface being lit, so camera changes can be ignored; this has been exploited by lightmaps for a long time. For VR this means that diffuse lighting …

Frameless rendering for VR?

I'm currently doing a fair bit of thinking about rendering (nobody pays me to think about hilarious cat pics yet, but one day!), and in particular about VR/AR low latency rendering.
One idea that keeps popping into my head is decoupling the shading rate from the display rate. In a conventional render path we shade lighting etc. at the same rate as the display, but VR has started to change this with asynchronous time warp.
Time warp takes the previous frame and warps it by the current VR pose, to give the feeling of a faster refresh rate than the renderer actually outputs. This works because the difference is only displayed for a fraction of a second (a 90th or 120th of a second), and the amount of change in that time is fairly restricted.
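A crude sketch of the reprojection at the heart of time warp, reduced to a single axis of head yaw under a pinhole camera model: each pixel's screen position is shifted by however much the head has rotated since the frame was rendered. The function name and parameters are illustrative, not any particular VR SDK:

```python
import math

def timewarp_shift(pixel_x, render_yaw, current_yaw, focal_px):
    # Angle of this pixel relative to the camera at the time the frame
    # was rendered, under a pinhole model with focal length in pixels.
    angle = math.atan2(pixel_x, focal_px) + (render_yaw - current_yaw)
    # Reproject that angle into the current head pose's image plane.
    return focal_px * math.tan(angle)
```

Because the pose delta over one display interval is tiny, the warped image stays close enough to a true re-render that the eye accepts it; that is the property frameless rendering would push further.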
I've started to think about taking that to its logical extreme: if the display rate is fast enough (90+ FPS), do we actually need every object to be completely up to date? All that matters is that over a few frames every object gets updated. …