Professional - Blog

Sharing a Cache Line with Atomic Variables - When to be Worried?

Originally posted,

Sharing a Cache Line with Atomic Variables - When to be Worried?

In my engine (see Projects), I have long made a practice of being aware of data that shares a cache line with a variable on which an atomic operation is expected to occur. A great example is the common practice of using an atomic variable as a reference count for an object. Sharing that line always seemed like a bad idea: an atomic read-modify-write effectively invalidates the cache line it touches, so any nearby data access will take a cache miss and stall while the data is reacquired from the backing store. Working on the code base today, I hit this nugget, which suggests my instincts held up. Though it adds a non-trivial amount of padding, I do encourage being mindful of how atomic variables are packed into structures (or globals). Also, when I have some time, I will run some tests on both x64 and ARM to see if these numbers hold up. I am hopeful :)

https://en.cppreference.com/w/cpp/thread/hardware_destructive_interference_size

__cpp_lib_hardware_interference_size = 201703
hardware_destructive_interference_size == 64
hardware_constructive_interference_size == 64
 
sizeof( OneCacheLiner ) == 64
sizeof( TwoCacheLiner ) == 128
 
oneCacheLinerThread() spent 517.83 ms
oneCacheLinerThread() spent 533.43 ms
oneCacheLinerThread() spent 527.36 ms
oneCacheLinerThread() spent 555.69 ms
oneCacheLinerThread() spent 574.74 ms
oneCacheLinerThread() spent 591.66 ms
oneCacheLinerThread() spent 555.63 ms
oneCacheLinerThread() spent 555.76 ms
Average T1 time: 550 ms
 
twoCacheLinerThread() spent 89.79 ms
twoCacheLinerThread() spent 89.94 ms
twoCacheLinerThread() spent 89.46 ms
twoCacheLinerThread() spent 90.28 ms
twoCacheLinerThread() spent 89.73 ms
twoCacheLinerThread() spent 91.11 ms
twoCacheLinerThread() spent 89.17 ms
twoCacheLinerThread() spent 90.09 ms
Average T2 time: 89 ms
 
Ratio T1/T2:~ 6.16
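
As a rough illustration of the mitigation I have in mind, here is a minimal sketch (not code from the engine; the type and field names are hypothetical) that pads an atomic reference count onto its own cache line with alignas, assuming C++17:

#include <atomic>
#include <cstddef>
#include <new>

// Fall back to 64 bytes if the library does not report the interference size.
#ifdef __cpp_lib_hardware_interference_size
    constexpr std::size_t kCacheLine = std::hardware_destructive_interference_size;
#else
    constexpr std::size_t kCacheLine = 64;
#endif

// Hypothetical layout: the atomic count is aligned (and therefore padded) to a
// full cache line so that threads bumping the count do not keep invalidating
// the line that holds the read-mostly payload fields.
struct RefCounted
{
    alignas( kCacheLine ) std::atomic<int> m_niRefCount { 1 };
    alignas( kCacheLine ) int              m_aiPayload[8] {};
};

static_assert( sizeof( RefCounted ) >= 2 * kCacheLine, "count and payload should land on separate lines" );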


Incentives, Behaviours, Culture - Post 2 - Work Hours

Originally posted,

Incentives: How much work is too much?

This post continues the series on Incentives -> Behaviours -> Culture that I started last week. This example focuses on the number of hours that an employee can self-select to work outside of their regular paid time.

Summary

This post covers the natural and artificial incentives for employees to work outside their paid hours. When managed well, this extra work can be healthy for both the company and the employee. In most real-world cases, however, it is handled in a way that drives employees to perform the maximum amount of work for the same compensation, or for a reward that costs the company significantly less than the fully burdened cost of an employee (i.e. the company's value generation per dollar, even including the bonuses, increases dramatically through this additional work).

Suppose a company incentivizes additional work by an employee and has no process to limit the total work done longitudinally. In that case, there will be a rising-tide issue. Everyone will be driven to ensure they exceed the mean/median for their role and level to reduce the risk of job loss, and the result will be constant upward pressure on expectations. These actions create a feedback loop in which employees need to do even more work to keep up. Rinse and repeat.

This cycle is how most tech companies create a crunch culture and excessive work hours while claiming to be dedicated to work-life balance. They allow these natural incentives to drive the cycle, forcing employees into excessive work hours and managing out those who will not commit to the needed levels of work. Any company that does not monitor and limit the total work effort of an employee is, in effect, hoping to see this cycle play out.

Read More

Incentives, Behaviours, Culture - Post 1

Originally posted,

Incentives - Behaviours - Culture

Building a Framework

I will begin a series of posts that touch on team and organizational culture, desired behaviours in that culture, and the incentives that drive the observed behaviours. This specific linkage has dominated my philosophy for organization building and remediation.

A fellow Principal Engineer at Amazon (he was in the Alexa organization and had come over from Valve) and I were discussing engineering quality, the evidence needed for a promotion, and the work that was rewarded in performance reviews when he mentioned that the resulting tension (and issues) we were seeing was due to the incentive structure at the company. He followed it up with a comment that, as a gaming engineer, it was easy for him to see and understand how company incentives created perverse behaviours and undermined the desired company culture. I took a moment and thought about that statement. In creating video games, the team has to generate player incentives (some overt, others much less so) to engage and move the player forward. Artists will light a set so that the player naturally focuses on the areas of a level we want to draw their attention to. The lighting sequence will cause a player to move along a path set by the designers through a level. Game design systems create a reward loop to keep players engaged and improving their skills. A more extreme example of these game design loops can be seen in free-to-play games, where we can create play-reward loops based on virtual goods that cause people to spend large amounts of money. Taking my experiences in these areas, I took a new look at the reward incentives at tech companies. After a few months of thinking (and reading), I developed the title mantra: incentives lead to behaviours that lead to culture. Work backwards through each step if there is an issue with your culture.

In 2019, in the middle of my MBA, a business class reviewed a similar case of incentives that built on these earlier insights. An area was seeing a surge in its cobra population, so the government created a bounty on cobras to reduce their numbers. The result was that people started cobra farms to cash in on the bounty. Worse, cobras escaping from the farms resulted in a net increase in the number of snakes in the area. This is why, in business reviews, bad actors exploiting an incentive tend to be called “cobra farms.”

We will begin with the assumption that for any incentive, there will be some number of people who will look to game it for their benefit. We will also assume that people will, on average, attempt to maximize their gain (benefit, growth, etc.) when selecting which incentives to follow.



Outcome Based Goals in Performance Management

Originally posted,

Over the last decade, as I have been responsible for establishing or improving employee performance management frameworks, I have spent time thinking about the proper criteria and methods. In 2017, when I joined Amazon, I was impressed with the diligence that Sunny Jain (manager and VP for NA Consumables) and the entire business team spent on understanding and proving the causal links between a business input and the resulting output. In a 1:1 with Sunny, he explained that at Amazon, the business was focused on setting goals on business inputs (controllable and measurable). I heard the same message during my MBA, where the professor discussed the evolution of modern business management to focus on inputs. I took these learnings forward into my goal-setting considerations in performance management. Let’s bring this down to concrete examples (that have been sanitized and are symbolic of cases I have seen or discussed in the industry):

Read More

Language and Return on Investment

Originally posted,

Reading through an article (We’re not prepared for the end of Moore’s Law @MIT Technology Review) this morning reminded me of an argument that constantly came up during my time in the games industry. We would spend a lot of engineering time to get the most out of the fixed hardware we had on a console. It was imagined that, in time, the power of the platform would mean we would spend less time maximizing performance and more on making the best use of engineering time. We have definitely seen that evolution since the first days of console software writing. The choice many people have made to use Unity or Unreal is a great example of this type of decision. A custom, purpose-built engine could see better performance and thus more functionality, but the increase in sales does not rationally justify the needed engineering effort. Anyways, quoting from the article (regarding the death of Moore’s law):

One opportunity is in slimming down so-called software bloat to wring the most out of existing chips. When chips could always be counted on to get faster and more powerful, programmers didn’t need to worry much about writing more efficient code. And they often failed to take full advantage of changes in hardware architecture, such as the multiple cores, or processors, seen in chips used today.

Thompson and his colleagues showed that they could get a computationally intensive calculation to run some 47 times faster just by switching from Python, a popular general-purpose programming language, to the more efficient C. That’s because C, while it requires more work from the programmer, greatly reduces the required number of operations, making a program run much faster. Further tailoring the code to take full advantage of a chip with 18 processing cores sped things up even more. In just 0.41 seconds, the researchers got a result that took seven hours with Python code.

Makes me wonder if we will see a return (full circle) to this debate of engineering effort vs. value per watt of CPU execution time.



11 Years and It is Time to Post

Originally posted,

Well, this is interesting - it has been over 11 years since I last put some thoughts together to make a post. Originally the blog started out as a way for me to collect thoughts and best practices that I wanted to keep in mind as I moved between jobs. Then, as I moved into higher-profile jobs and into management positions, I grew a little concerned about my public profile and took a step back. Once again, I feel it is a good time to start collecting some of the learnings I have acquired over the last decade across industries and projects. I was part of the leadership team at Amazon managing Consumable retail (anything you would buy, consume, and buy again - for example, toilet paper) in the middle of a pandemic, and earlier I was part of the leadership team responsible for shipping two video game titles in one of the world’s largest franchises (Call of Duty: Modern Warfare 3 and Call of Duty: Ghosts). I have managed and worked with people across the globe. It has been an interesting 11 years. I have some time on my hands right now (this and future posts, like the posts below, tend to be written during downtime between jobs) and I look forward to adding to this space. Some of these I’ll cross-post on LinkedIn, and others will remain tucked away on this blog for the interested reader or just my future self.



Conflict Management

Originally posted,

There is a lot to be said about this subject and I will probably update this entry a couple of times before calling it a day. There are many different methods that people advocate for dealing with conflict, some of which I will discuss later for comparison. My method is fairly simple. The first part is abstraction - listen to the person who has the problem, but prevent yourself from creating stories or rationalizations around what they are saying. Simply listen to the content of the message and ask clarifying questions as necessary. Avoid assumptions (again, part of the “do not make up your own story” rule). Continue to ask questions until you have the whole picture. Be open to criticism - you only grow and develop when you can discover weaknesses and improve the things you have been doing wrong. Even if it is not about yourself, give yourself time to reflect on what the person has expressed, and make sure they understand that you will think on what has been said. Always try to remain objective about the matter - the more emotional you are, the more unfair you are likely being to the person. Be open and honest.

Read More

Recruiters

Originally posted,

I find recruiters to be interesting. I have heard most of them describe themselves to me as external contract HR workers for multiple companies. I find that hard to reconcile with the fact that they are using their client (us) to make their wages. Is their most important job filling an opening at a company, or finding the right company for their client? Is it their job to sell a position to the job seeker, or to sell the job seeker to the company? I am not sure there is a good answer here, but it does make dealing with recruiters a difficult proposition. Having worked with them to some degree on both sides of the fence, I was never happy with their position in either case. As an employer, I started to ignore entire batches of people because of the person representing them - I found the quality of candidates the recruiter was generally presenting to us to be sub-par and assumed that would continue to be the case in the future. As a recruiter’s client, I have had jobs pushed on me that I did not want (and that was made clear), and multiple cases where the recruiter did nothing during the entire interview process. I guess I feel that if you are going to get a large percentage of my salary as a recruitment fee, then simple pre-interview work should be done to help your client: information on general interview practices, expectations, etc. I am more than happy to do this myself, but then I have to wonder why I am bothering to work through a recruiter in the first place.

Take Away: I think recruiters should be more involved with marketing their client than acting as an external HR department. They need to work with a more select group of individuals and stop the practice of shotgun-sending as many people as possible to as many studios as possible.



Illusion of Not Enough Time

Originally posted,

A small break from my list of four topics, but this is one of the most amusing statements I have heard, and it is used repeatedly while working in games. Every time a manager has used this phrase on me, it has taken more effort to prevent myself from exploding into laughter. It is almost always used to explain why action will not be taken on something that the person in charge, at heart, does not want done in the first place. The other possibility is that real time pressure makes it feel like there is genuinely no time to work on something. However, in most cases where I have heard it used, the result is that more time is wasted trying to be “fast” and avoid the proposed work than if we had simply moved forward with the original request.

The example I have heard most often, when this phrase is used as a defense, is explaining why a particular process is not being followed. “There is no time” to follow XX process, and we have to just move forward and get the work done. If that phrase didn’t remind you of a meeting that made you cringe, then you are a very lucky developer. I’ve been in the room for this statement so many times that I don’t even flinch anymore. My only reaction is to make sure my schedule is clear for the inevitable crash-and-burn on the first real deliverable that uses the results of the meeting. (I suppose I should be taking vacation time instead, but I have always had a strong feeling of responsibility for the work in which I am involved.) It is important to pace yourself and not fall into the trap of “just getting it done”. Process was created to prevent unneeded work and, more importantly, wasted work. This matters even more under time pressure, when scarce, in-demand resources will otherwise be spent uselessly.

Take Away: If you find yourself thinking that there is “not enough time” - stop, think about why this defense is coming to mind and fix (do not avoid) the problem.



Studio Management

Originally posted,

In my last blog post I talked about some of the issues with the explosive growth in the game industry. Now I want to talk about how to manage this growth. There is no hard and fast rule for how companies organize studios and create management teams. The same title at different companies can have very different responsibilities and requirements. However, I have found a recurring problem arriving from two different directions but with the same driver: there seems to be a need for a central authority figure for all decisions controlling a department, and in almost all cases this is the manager. In most cases the needs of a development team are in direct conflict with this prevalent attitude, which is usually defended by asserting that managers are the only people who understand the whole scope of the problem.

Read More

Investment

Originally posted,

As it turns out, as I was putting together the subjects I wanted to talk about for the next few weeks, a theme became obvious. I am going to spend the next few weeks talking about the game industry as an industry and a place of employment. The four main talking points will be: keeping a team invested in their product and the studio, practical approaches to studio management and its impact on the decision path, the impact of recruiters and their particular bias, and finally a discussion on conflict management. Most of this stuff is not new and has most definitely been discussed in many forums, but since a blog is about talking from your own point of view, I figured they were good subjects to cover. The thoughts are fresh in my head as I have recently taken a few Microsoft courses on related material, which have at least been informative and in some cases instructive. The refresher in thinking about these issues is what led me to this particular sequence of blog topics.

Read More

Movie and Game Comparison

Originally posted,

The game industry is relatively new and is still trying to establish a professional working model that works for both employer and employee in the modern working environment. It uses a studio model similar to the movie industry’s, but has so far (for the most part) eschewed that industry’s feast-and-famine approach. Game companies tend to hire people full time and not just on a contract or per-project basis, with only limited hiring flux outside of QA teams during development. Thus, in many ways the comparison between the two is not very fruitful; still, given that they are both media (entertainment) industries (a connection developed in the last entry), we do find some fruit on the tree to make comparing and contrasting them an interesting exercise.

Read More

Movie and Game Collaborations

Originally posted,

Interaction between the movie and game industries comes in three ways: movie companies that try to establish gaming studios, movie auteurs who become involved in game projects, and game companies that try to establish effects/movie-related studios. The first has been met with arguable success a few times but also with definitive failures. LucasArts is an obvious example of the arguable success. The games they made in their early history were significant and made large contributions to the growing gaming industry. (My favorites are Day of the Tentacle and Full Throttle.) However, past those golden years (after they restructured and let go most of their old 2D / adventure staff), things have been increasingly volatile at the company. They go through regular waves of growth and shrinkage - the most recent during this month. The games are still well received, but at the same time the company has become a point of concern for people working in the industry due to the cyclical pattern of their studio employment. While LucasArts has been a commercial success, I would argue that due to their method of employment, we have to temper that success with concern about it as a place of stable employment. I have heard of other attempts, but since I have not heard of shipping titles, I am assuming that things did not go well for the most part. Digital Domain started a gaming studio and I have not heard anything from them in some time. To my knowledge, DreamWorks has been involved in some projects, but only as related to their own IP. Disney at least is a player at the financial level (investing in various larger groups) but directly as a corporation has not made much traction outside of their own IP (and arguably other people have done much better with that IP than Disney themselves). Overall I think the problem is that expectations are mismanaged and there is a belief by these movie studios that there should be a significant amount of skill crossover. However, both in creative and in technical development, there is little crossover in skill set. The result is tumultuous and generally leads to projects being terminated. I think there is potential to be tapped, but care has to be taken to manage expectations, understand the limited skill-set crossover, and determine if there really are sufficient business reasons for such a lateral (and significant) shift in growth direction.

Read More

Apple Keynote

Originally posted,

I was watching the recent Apple keynote (Oct 10, 2010) and I was rather surprised by some of the numbers. I have to admit that I have not been paying much attention to the market, or Apple hardware in general, but I was astonished at their current market share compared to their historical trend back in the days when I did pay attention (1980-2000, basically when I had the time and was not working). Personally, I think one of the reasons that Apple hardware is doing so well in the current market is normally overlooked. Previously, the closed Apple platforms were a major issue because the rate of change in the computer industry was so very high. I won’t speak for anyone else, but it has been some number of years since I have had to upgrade any of my computers to support the new version of productivity software like I did in the “old” days. We have reached a point in hardware where the inability to upgrade (change out the motherboard or plunk in a new CPU) is not a major problem. For that matter, with the way that CPUs and memory are so connected now, it is almost impossible to really upgrade a computer without replacing all the major components. All current-generation video cards are more than capable of handling the UI requirements of the OS and normal software - so that is a non-issue. This is the time and place where Apple computers are so well suited - when people are looking for stability and security in their computer decisions rather than some ephemeral ability to upgrade that would have marginal or no impact on their actual use of the computer. The other reasons that people talk about are well documented, but I will mention them quickly: customer service has been great and deserves the reputation, a large part of their increased market share is due to their strong portable offerings, and of course many people have been exposed to the brand because of the raving success of the iPod and iPhone. Many people have slowly moved over to the Apple ecosystem. Myself, this is being typed out on a 27” iMac, and I am/was a dedicated DIYer when it came to my computers. It’s still how I put together my PCs. I do retain a PC for my development (Xcode sucks hard, makes my skin crawl really), and because it is the environment I am most comfortable working inside. But I find that the Mac environment is not the wasteland that I thought it was back in the 90s. My first Mac was a Mac mini server and that thing is a great piece of hardware that I use as my Perforce server and iTunes server. Power consumption on it is great, it is quiet - I pretty much loved the thing from the moment I plugged it in. There is something to be said for hardware-software integration and a company that works in such a connected way.



CMake

Originally posted,

It has taken me a long time, but I have moved over to using CMake as a pseudo-build step. The basic premise of the system is that it generates platform-specific development files for the particular flavor of development environment available on that platform. For instance, it will generate Xcode projects when I am working on the Mac or for the iPhone, but it will also generate Visual Studio files for when I am working on the PC. However, for obvious reasons, it must deal with many of the issues that come with supporting only the lowest common denominator in terms of basic configurations. Specifically, if you go this route, a single generated solution cannot target multiple platforms, so you will not be able to have one solution for both your x64 and x86 code when working with Visual Studio. I decided that this was acceptable once I started having to work between Visual Studio and Xcode as part of my weekly process to validate code across both platforms and different compilers. I could try to maintain two sets of projects, but this starts to become an incremental nightmare. For instance, one of the best ways to develop SPU code is by running Linux on the PS3 (I have retained a single PS3 retail kit at the correct version so that I can continue doing this - I wanted to buy a slim version anyway and it was a good excuse). I would of course like to be able to compile at least some of my files and systems easily for this, and that would require maintaining makefiles in parallel as well. Now we are up to three different sets of project definition files that need to be maintained. Well, that is not going to happen. The final thing that pushed me over the edge was reading someone else’s blog (I forget the name, actually) where he called out solution and project files as intermediate files - their only real purpose is to allow the user to interact with their source files. Once you stop thinking of these files as part of the source and instead see them as just a management layer, making the move to generating them from another process is easy to do.

Read More

Project Organization

Originally posted,

I find it remarkable that in most places I have worked, people have spent little time or thought on the actual file layout of their projects. Over time I have developed a rather strong preference for the layout I use, because of the number of times that other methods have created huge headaches for me at the worst possible times. Specifically, I am a strong believer in out-of-source compilation. Let me start with an explanation of the setup used by TGS. It pretty much follows the standard Unix model. From the root directory I have two source directories: /inc for include files and /src for the source files. Each of these contains sub-directories as necessary for each system, but only the root of each of these paths is included in the compilation settings (so /inc is part of the search path but /inc/TgS COMMON would not be). This means that users need to use relative paths for files stored in sub-directories, but I wanted to retain the clarity of where files were coming from during the compilation phase. There are two intermediate directories: /prj holds the project and solution files that are generated by CMake (see above) for interacting with the source files, and /obj contains all of the compiler intermediate files. Finally, I have two target directories: /bin to hold all executable files and /lib to hold all static libraries. Shared libraries or DLLs would be stored in the /bin directory, but I currently do not use them in any of my projects. I have three non-compilation-related directories: /web holds the TGS-related web content for this web page, /tst holds the testing environment and its files for the solution’s unit tests, and /doc holds documents related to TGS and studies I have done in regards to performance profiling to support some of my coding decisions.
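
For illustration, here is a minimal sketch of a top-level CMakeLists.txt wired to this layout (my own sketch, not the actual TGS build scripts; the target and file names are hypothetical). The project files are generated out of source by running CMake from the /prj directory and pointing it back at the root:

# Hypothetical top-level CMakeLists.txt reflecting the layout described above.
# From the root:  cd prj  then  cmake .. -G "Visual Studio 10"  (or -G Xcode on the Mac)
cmake_minimum_required( VERSION 2.8 )
project( TGS C )

# Only the root of the include tree is on the search path; files in
# sub-directories are referenced with relative paths.
include_directories( ${CMAKE_SOURCE_DIR}/inc )

# Executables land in /bin and static libraries in /lib; the generator keeps
# its compiler intermediates under the binary directory it was run from (/prj).
set( CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_SOURCE_DIR}/bin )
set( CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_SOURCE_DIR}/lib )

add_library( TgsCommon STATIC ${CMAKE_SOURCE_DIR}/src/common/example.c )   # hypothetical source file
add_executable( TgsUnitTest ${CMAKE_SOURCE_DIR}/tst/main.c )               # hypothetical unit-test driver
target_link_libraries( TgsUnitTest TgsCommon )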

Read More

Skill Gap

Originally posted,

There are significant generational changes in programming (revolutionary) and then the more normal progress (evolutionary). The game industry is a little odd in that we can get stifled or stagnant on a particular generation of technology because of the mismatch between technology change and the console generation. The transition from 2D to 3D was a major shift that many programmers were never able to make, and there was a large shift in the industry. The move to multi-core processing was more evolutionary, and in many studios the need to even be aware of the requirements of working concurrently was isolated to a few programmers working at lower levels. However, it is my belief that GPGPU is going to be another major shift. The major complaint about programming on the PS3 was that to achieve maximum success it was necessary for many programmers to be able to program and use the SPUs. The next generation of consoles is going to leverage GPGPU algorithms and break the standard render pipe. This is going to require rethinking how GPU computational resources are used, but more importantly it is going to require knowledge of how to set up and use GPU jobs and tasks to get the most out of the new generation of hardware. For people who found the SPUs an issue - this will be much worse IMHO. Just as large a problem is the gap being generated between the coming console generation and the rather significant changes we’ve seen in the PC landscape in terms of GPGPU and computation in general. The skill gap that we are creating in the industry is significant. I think the companies that come out of the start of the next generation well will be the ones that enforce, now, the culture changes required to work on this hardware (the SPUs can be used as a basis for that skill growth).



Batch Magic

Originally posted,

I have always had only a fairly loose grasp of the DOS batch language. Whenever I needed to do something, I would spend large amounts of time either with the old DOS manuals or, later on, online looking for examples. I only ever used it lightly, primarily because I rarely needed to do anything in DOS itself. However, while working at Obsidian, the Chief Technology Officer (Chris Jones) wrote the initial build system in DOS batch, and I ended up having to debug, maintain and extend the system. It was interesting and sometimes challenging to get the language to do what I wanted. Thankfully, the command line interpreter in NT has some really useful extensions to standard DOS batch that make using it a little easier. For my own sanity, here is an example set of batch files that I use for processing all of the files in my source tree to generate the resulting html found on this web page.

Read More

iPhone Development

Originally posted,

I don’t have any comments this week on coding suggestions. I have spent most of the time working on having my base unit tests execute and pass on the iPhone. It has been interesting, and many of the decisions I made when building the engine have made my life fairly easy in getting this all to work. I already have a layer to support the current generation of consoles - I am so used to writing code that supports those three platforms it has become a hard habit to break. Adding support for the iPhone has (so far) only required creating a few wrapper functions for some of the Objective-C functionality, stubbing out the vector library (passing all access to the scalar library), and implementing the base OS platform functions. I have it at a 90% pass rate right now, including all the threading tests. The lockless stuff worked pretty much right out of the box, which was nice to see. I don’t have much of a takeaway on this one, other than that correctly architected code should always be easy to support on new platforms - try not to make too many assumptions about the hardware platform or you could be in for a lot of pain when it comes time to integrate the new platform into the code base. The other interesting aspect was that my life was as easy as it was because this version of the engine is ANSI C, and I had integrated CMake into the build process so that creating new platform project/make files was handed off to that system. It worked out great - though there were a few local changes to cmake.exe I had to make for MSVC to set everything up the way that I wanted. I can’t wait to get the IO and rendering tests working now - it will be awesome to have a unit test running simultaneously on the iPhone, iPad, and my PC.



x64 Registers and Calling Convention

Originally posted,

There are two principal aspects of the x64 architecture for programmers. The obvious change is flat 64-bit memory addressing, but the second is a little more interesting - there are now 16 64-bit general-purpose registers available on the CPU. This means that by default the MSVC compiler uses a fastcall convention when compiling 64-bit programs. This convention places the first four parameters into registers - integer-sized parameters into RCX, RDX, R8 and R9, and floating-point parameters into the SIMD registers XMM0, XMM1, XMM2 and XMM3 - and the remaining values on the stack. Integer return values are placed into RAX and floating-point return values into XMM0. This means that 64-bit applications potentially need less stack manipulation (memory copies and marshalling into registers) than their older 32-bit cousins. However, there are a few things to keep in mind - the stack pointer itself must be 16-byte aligned whenever calling a function. For example, even a parameter-less function call needs to put the return address on the stack, but this is only an eight-byte value. Thus, the stack will have to be padded by an additional eight bytes to meet the function-call stack alignment requirement.
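
As a small illustration of that mapping (my own sketch; the function and its parameters are hypothetical), the comments below show where each argument of a mixed signature lands under the MSVC x64 convention:

#include <stdint.h>

/* Hypothetical function with mixed integer and floating-point parameters.
 * The register slots are positional across both register files:
 *   a (1st) -> RCX        b (2nd) -> XMM1
 *   c (3rd) -> R8         d (4th) -> XMM3
 *   e (5th) -> stack      (the fifth parameter onward is spilled)
 * The integer result is returned in RAX; a double result would use XMM0. */
int64_t MixedArgs( int64_t a, double b, int64_t c, double d, int64_t e )
{
    return a + (int64_t)b + c + (int64_t)d + e;
}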



Lockless Programming and Cache Lines

Originally posted,

Having talked about the advantages of lockless programming, there are some things that need to be taken into account when doing the data design for these systems. Primarily, one of the major factors to consider is that any interlocked/atomic commit to a variable must, by necessity, invalidate the cache line on which the variable is stored. Depending on the use case for the lock, this may require isolating the variable in question from other data, either by padding the structure or by keeping the lock independent of the data itself. For example:

struct { SpinLock m_Spin_Lock; int m_iData; } Bad_Layout[101];
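
In the layout above, the lock and the data it protects share a cache line (and array neighbours share lines too). For contrast, here is a sketch of the padded alternative the text argues for (my own illustration, assuming a 64-byte cache line; the actual TgS layout behind the Read More link may differ):

/* Hypothetical padded layout: each element occupies a full cache line, so an
 * atomic operation on one element's lock cannot invalidate its neighbours.
 * The array itself should also be allocated on a cache-line boundary. */
#define CACHE_LINE_SIZE 64
struct
{
    SpinLock m_Spin_Lock;
    int      m_iData;
    char     m_Pad[ CACHE_LINE_SIZE - sizeof( SpinLock ) - sizeof( int ) ];
} Better_Layout[101];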

Read More

Lockless Programming

Originally posted,

There is a lot of talk and discussion about lockless programming, but the reality is often lost in confusion and a lack of specificity in nomenclature. Lockless programming is a vague term that encompasses many methods and algorithmic approaches to multi-threaded programming. It is most often used to imply the use of atomic operations in lieu of the standard synchronization primitives. However, it is not specific about whether the algorithm is lock-free (at least one thread is guaranteed to make progress in a finite number of steps) or wait-free (every thread is guaranteed to complete in a finite number of steps). In many cases, algorithms that are described as lockless are simply re-implementing the standard synchronization primitives in atomic code. For instance, the standard critical section from Microsoft can be configured to spin on the lock check for a defined number of iterations before sleeping (a context switch out). This is no different from the many spin-locks that are custom written and integrated into a “lock-less” implementation.

My expectations when using and implementing a lockless algorithm are the following:

  1. The algorithm should be lock-free (at least one thread is guaranteed to make progress in a finite number of steps)

  2. In most cases, I want the implementation to guarantee order of access (it is unclear if the standard primitives do this)

  3. The implementation is scalable within the expected number of concurrent executions (8-32)

  4. Implementation needs to guarantee order of read-writes of the controlled data (keep in mind that the standard primitives all integrate memory fence / barriers in their execution)

  5. Unnecessary data flushes are kept to a minimum (for instance, atomic operations will invalidate the entire cache line where the variable is stored)

  6. Spinning should be done in such a way as to free the CPU for use by other threads / hyper-threads. Yielding (a context switch out) should never be done except as a fail case. (A minimal sketch of such a spin follows below.)
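
To make expectation 6 concrete, here is a minimal sketch of a test-and-test-and-set spin (my own illustration, not the TgS implementation; it assumes C++11 atomics and an x86/x64 pause intrinsic):

#include <atomic>
#include <emmintrin.h>   // _mm_pause; other platforms have an equivalent spin hint

struct SpinFlag
{
    std::atomic<bool> m_bLocked { false };

    void Lock()
    {
        for (;;)
        {
            // Attempt to take the lock; exchange returns the previous value.
            if (!m_bLocked.exchange( true, std::memory_order_acquire ))
                return;

            // Spin on a plain load so the cache line is not hammered with writes,
            // and pause to hand execution resources to the sibling hyper-thread.
            while (m_bLocked.load( std::memory_order_relaxed ))
                _mm_pause();
        }
    }

    void Unlock() { m_bLocked.store( false, std::memory_order_release ); }
};

The acquire/release ordering on the exchange and the store is also what provides the read-write ordering of the controlled data called out in expectation 4.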



Inline and Non-Inline

Originally posted,

Performance coding is always a balance between execution speed and resources consumed. Even achieving the desired execution speed is a balance, as the increase in code complexity needed for a particular optimization can, at worst, defeat the desired speed increase from the change. A general method to improve execution speed is having a function inlined - compiled directly into the calling function. Let’s consider the work that is normally done when calling a function (a small sketch follows the list):

  1. Marshal the parameters for the function onto the stack

  2. Obtain the address of the function, and jump to the location.

  3. Construct a stack frame for local variables

  4. Perform the function execution

  5. Deconstruct the stack frame

  6. Marshal return values into the expected return locations (stack, or registers)

  7. Return to the calling function
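
As a small sketch of the trade-off (my own illustration; the function names are hypothetical): the first form pays the call overhead above at every use, while the second lets the compiler substitute the body into the caller and skip most of those steps at the cost of code size.

// Out-of-line: every call site pays for parameter passing, the jump, and the
// stack frame construction and teardown listed above.
int AddOutOfLine( int a, int b );    // defined in another translation unit

// Inline candidate: the definition is visible to the caller, so the compiler
// can fold the body directly into the call site and remove the call machinery.
inline int AddInline( int a, int b ) { return a + b; }

int UseBoth( int x )
{
    return AddOutOfLine( x, 1 ) + AddInline( x, 1 );
}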

Read More

Pass In Register

Originally posted,

Well, apparently it’s been over three years since I last made a post. I was originally planning on trying to post something every week, but I am really bad when it comes to any type of regular correspondence. We’ll see how long I manage to keep it going this time, eh :)

So, as for the title - pass in register. This is an interesting technique that in some cases can give a very good performance gain, but it needs to be balanced against the number of available registers. On the PPC platforms (consoles) we have a ton of registers, so passing things by register is something that can be done regularly without too much forethought. However, on the PC the number of available registers is much more limited and care should be taken when using this execution path. Keep in mind as well that the optimizer on the PC platforms can often do a better job, since most functions that use pass-in-register semantics are most likely inlined as well. Be careful trying to be smarter than the optimizer (even if you think - like me - that most optimizers are only slightly better than a five-year-old when it comes to manipulating code).

Read More

Parameter Passing

Originally posted,

Reworked the entire code base so that parameters are declared using a specific syntax so as to let me toggle whether certain types (specifically vectors and matrices) are passed by value or by reference. This is so that on the console environments I can pass things by value but keep passing them by reference on the PC platform. Fun… not!
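
The idea is something along these lines (my own sketch, not the actual TgS syntax; the macro, type and function names are hypothetical):

// Hypothetical parameter-declaration toggle: console builds take vectors by
// value (they can travel in registers), while the PC keeps the const-reference
// form to spare its smaller register file.
struct TgVEC { float x, y, z, w; };      // stand-in vector type

#if defined( TGS_CONSOLE )
    typedef const TgVEC   TgVEC_PARAM;   // pass by value
#else
    typedef const TgVEC & TgVEC_PARAM;   // pass by const reference
#endif

TgVEC Scale( TgVEC_PARAM vecIn, float fScale );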

Tried installing Vista x64 since it’s got so much bling it must be a great platform to work on, right? I mean, you have to use it if you want to play around with DX10. I’m sure people looking at pus-infected plague victims had the same thoughts, because really it was almost that bad. I couldn’t even get the system to run the x64 version of the software. What a waste of time. Guess I’ll give it the standard two-year wait before even thinking of moving to the new MS OS, since that is how long it seems to take them to get a decent development platform working on it. Bleh!

Concentrating on the physics platform for now and continuing to write constraint software. One thing is left to do before moving on to that: finalizing a working compilation again - I had to change the return values of many functions so that, like the parameters, they can be returned either by value or by constant reference. Hopefully that will not take long. My development box is back to Win XP x64, so things should go smoothly tonight.



PPC Compiler

Originally posted,

I was quite proud of the way I had designed my math and collision code base using templates so that it allowed for easy flexibility between float and double computations. With the native 64-bit nature of the new PPC chips this could be a very strong asset for collisions that require extra precision (quadratic surfaces, for instance). Then I find out that my good friend, Mr. Compiler, insists on doing a heap shuffle for each and every parameter for which a 1:1 mapping between variable and register type does not exist. For instance, the compiler will not pass a vector through in 4 float registers or a matrix through in 3/4 vector registers. It will insist on doing a heap shuffle - even when inlining the code (don’t ask me - I’m just saying what I see in the release-optimized asm output). This is enough to make me want to commit serious bodily harm on someone - the speed loss is ridiculous (for instance, some hand tweaking of one loop in the code base changed my frame rate from 2FPS to 55FPS). There are times you just want to take the compiler out back for a few rounds, eh -) So as it stands, the only way to get the needed efficiency would be to use a #define network of math functions - since this would allow for the automatic transfer of matrices as vectors. What a pain in the ass, eh -( Anyways - going to play with it a bit more and see - but as far as I know this was never solved for the PS2 compiler either, so I don’t have high hopes.



Xbox360 GPU functions

Originally posted,

I have been spending some time on the Xbox360 recently, working out how best to use the L2 locking functionality of the hardware and the specific GPU function calls in the API. Essentially they allow for greater separation between CPU and GPU execution, minimizing the number of synchronization points. This has required a rewrite of how video constants are stored and manipulated in general, keeping in mind the 64-byte alignment that is required for data transfer from the CPU to the GPU. Overall it’s been interesting.

Someone emailed me recently pointing out that my html parser mangled the code drop online - dropping any code after a division symbol (the parser was interpreting it as a failed comment). This has been fixed and so the code base should be more reasonable now. If anyone sees any other problems, please email me!

Implemented a basic input library through XInput. Bought myself a 360 controller for Windows so that I would never have to revisit DirectInput ever again. Anyone who has ever had to create a robust and thorough solution using it will understand - it’s a nightmare. I understand why it was designed the way it was - a PC can have any type of input device - but from a game point of view it could drive you nuts. XInput is just a slam-bam-thank-you-ma’am in-and-out affair - it’s wonderful.



Vectorization of a Physics Solver

Originally posted,

Been spending the last few days taking a standard physics solver setup and solution and vectorizing the resulting operations. It’s been a lot of fun and will make porting it to / working with it on an SPU much easier. I have also been trying to isolate small tasks to get a good to-do list going for the holiday break. Finally, I have been working out the last remaining issues in the X360 build - which is now up and running. I threw in basic controller, audio and XMV support since it literally takes only six lines of code on the X360. It’s amazing how easy the SDK for that platform is to use. One day, when I feel extremely masochistic and in need of a good sledgehammer to the brain, I’ll work on the PS3 port. It is possible hell will freeze over first. Hard to say. I did manage to survive multiple PS2 titles, so it ain’t all bad -)



Stick a Fork in It

Originally posted,

Whew. Finally done with LibXML2. I am 100% certain that I will have to revisit it in the future - but for now it compiles inside its own namespace and correctly handles the 32, 32/64, 64/64 and 64/32 native integer to address size issues (i.e. it compiles on x86, x64 and PPC). I was a little worried about integrating the Collada code after my experience with LibXML2, but it went in smooth as butter. It really highlighted the difference between the code bases - one professional and the other open-source enthusiast.

I finally was able to start working on the main physics engine. The first 50% of most of the common constraints has been implemented, and about 90% of the solver mechanism. What is left is to complete the constraints and to design and implement a contact graph. Should be interesting. However, once again I was sidetracked, as I have been rewriting much of the math library. As it turns out, on the PPC systems, because of the large number of registers on the chip, the compiler tends to pass function parameters in registers. However, if the function is not inlined and uses references, it forces the system to fetch the values from memory (ack!). So, I am rewriting all the routines to pass most parameters by value. At the same time I’m expanding the vector operation library to cover more of the available PPC routines that were not strictly implemented on XMM. I need them to be able to restrict the solver unit to a purely vectorized system.



LibXML2 - The Continuing Saga

Originally posted,

Information Overloading:

Just do not do it - ever!

I have continued to work on the LibXML2 integration. It has taken me longer than expected, specifically because I keep waffling on how, and if, I want to integrate this particular library. Unfortunately, it seems to be consistently updated with bug fixes, so a heavily modified integration would make re-integration of fixes a particularly annoying task. However, as it stands the code base is a slipshod combination of implementation confusion and a complete disregard for possible platform issues. The first time that I saw a pointer cast in this manner [(int)(long)(pointer)] I knew that people were just not quite grasping the whole issue of address casting. Simply put, they were putting their faith in the fact that the “long” type would match the address space - or in other words, someone slammed in the long cast to avoid a compilation warning when compiling on 64-bit CPUs without actually thinking about it. The subsequent int cast is just the nail in the coffin.
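
For contrast, a small sketch of the portable way to round-trip a pointer through an integer (my own illustration, not a patch to LibXML2): uintptr_t from <stdint.h> is defined to be wide enough to hold an object pointer, and a compile-time size check documents the assumption instead of hiding it behind a cast.

#include <stdint.h>

/* Hypothetical example: store a pointer in an integer type and recover it.
 * uintptr_t (where provided) is guaranteed to hold any object pointer, unlike
 * int or long, whose sizes differ between x86, x64/LP64 and Windows LLP64. */
void * RoundTrip( void * pAddress )
{
    uintptr_t uiStored = (uintptr_t)pAddress;
    return (void *)uiStored;
}

/* Compile-time check: fails to build if uintptr_t cannot hold a pointer.
 * Writing the same check against long is exactly what breaks on Win64,
 * where long is 4 bytes while a pointer is 8. */
typedef char Assert_Uintptr_Holds_Pointer[ (sizeof( uintptr_t ) >= sizeof( void * )) ? 1 : -1 ];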

Read More

LibXML2

Originally posted,

Finally got the majority of the LibXML2 source compiling - now I’m working on getting it into its own namespace and tying it into the existing engine infrastructure. I had to remove a bunch of things from the source - things like FILE access, since that is a haphazard thing on consoles in any case - and I also removed all the network code, since that was just something I did not see as particularly useful for me right now. It’s been a project of repeatedly pounding my head against a wall - but it’s definitely taking form now. Hopefully I will be able to stick a damn fork into it soon and move on to the Collada support. The major goal is to be able to load and render all of the Collada sample files within as short a period of time as possible. Hopefully without the use of too much vodka!



Open Source Software

Originally posted,

While I am not quite an anti-evangelist for open source software, I find that whenever I look in that direction for a possible solution, the source itself is an illegible mess of incoherent and often badly verified code. My current project is to get a working XML processor into the main build so that I can use an XML-based system for the data files, and then hook it into the Collada system for processing those files. I opted to go with LibXML2 since it was rated very highly in terms of development and standards compliance. I did not realize at the time that making that decision was akin to deciding to climb into an iron maiden because it looked like it was engineered very well using very sharp, good quality steel. Oops, my mistake! The code base has slowly been driving me nuts, with an almost haphazard approach to pointer arithmetic and variable size definitions. The common belief that sizeof(void*) == sizeof(int) is enough to drive someone trying to get code working on the x64 architecture around the bend. This little side project is definitely gonna take me a couple more days to complete, so until then - adieu.



Monday Curse

Originally posted,

How are you realistically supposed to have the energy or the desire to get anything done on a Monday evening? Well, that was my problem last night - I sat around trying to “decide” what to do next - read: procrastinated doing any work. Did a few compiles and fixes for the PPC build of the system - and then moved on to integrating some external libraries into the solution so that I could watch Heroes on the other monitor. Had to get them in eventually anyway - I needed both a basic text format as well as a platform-specific binary load and save process. For the text (or generic) data file I decided to go with Collada. That way I would not have to reinvent the wheel, and more importantly I would never have to delve back into the “pile” that is the 3D Studio Max SDK. Honestly, a few rounds with razor blades would probably be more pleasant than having to use that thing ever again in my life.

Got zlib into the solution and working. Next was libxml2, which is taking a little longer. Got most of the include path issues solved, but I still have to either fix the configuration or get iConv into the solution and working. I have not decided - though I am leaning more toward iConv right now. I can see how it could be useful later for UI systems. I plan to support either UTF-8 or UTF-32 text streams for text output, so iConv may be useful outside of libxml2. Once it’s up and running I can move on to getting the Collada DOM and implementation files up and running.



TGS Development

Originally posted,

I have spent a lot of time trying to figure out how to write a blog. I mean really, how can someone write something daily or even weekly that is remotely interesting or relevant? However, somewhere in my head I knew that I should be doing something with this space - something useful. It finally hit me what would be both useful and a way to actually write this in a relevant manner - I will document the ongoing changes and work being done on my home project (TGS), its features and considerations. It may one day be useful to some person when they start puttering around with engine code -)

Read More

Textbooks and Concentration

Originally posted,

It did not occur to me until recently that textbook reading is really a skill that has to be trained to remain usable. It’s been a couple of years now since I’ve read anything academically longer or bigger than a journal or transaction paper, and now I’m sitting down to read a couple of books on Voronoi Diagrams and Computational Geometry and it’s hurting the old brain - well, at least the eye muscles. It’s amazing how fast we get used to increasingly short focal periods. I have spent the last few years working on multiple computers with multiple screens. I program while having a movie playing on an opposing screen. We laud our ability to multi-task and fail to realize that what we’re really saying is that we are decreasing our ability to focus on one particular task for long periods of time. I used to sit down with a textbook and read it all day, stopping only for lunch. A couple of months ago I found myself having to concentrate to remain focused on the material and the book after an hour. I am back in reading form now - but it made me wonder how much of the skill set we think of as a positive indicator of ability in computer programming is in reality a decrease in real functionality. Bleah!



Firefox and Writing-Mode

Originally posted,

So I’m starting to think that having a blog means writing something more than once a month, or in this case more than once every 3 months. Been busy working on NWN2, so many things outside of work have kinda slid into obscurity over the last few months. I am beginning to get the web page ready for a more public and broad spectrum of people to access and found out that many of my code pages did not render at all in Firefox. I realize that text formatting is a complicated matter - I spent too many years in front of DTPs (Ventura and then later PageMaker) not to know that. However, the way that each browser seems to flout any type of conformance to any standard is enough to drive people up a wall. I am now 100% certain that browsers are made by web designers who believe in job security through obscurity. Seriously, if any language with the complete lack of standards compliance that html/css has came out as a professional product for general-purpose programming, it would be laughed out of the market and dropped into the dust bin of history. The one saving grace was PHP, where I could create a ridiculous state list to take into account the vagaries of the different browsers. What a pain!

Anyways - so with that done the pages should be showing in Firefox now - downloaded it just to test it and it seems to be working out fine. Testing in IE 7.0 as well, just to make sure.

Starting to think about the future, and what I want to do with my little project. The XNA announcement from Microsoft allowing casual developers to make X360 games has me thinking about distributing the physics/collision/scenegraph part of the engine for people making free games to play with. Eventually, I may put together a toolset and graphics core, but that is probably much more long term.

If you have any questions about the collision code feel free to send me an email or just talkback on the blog. I will be adding explanations and commentaries on request.



MMORPGs R.I.P.

Originally posted,

This will be a talk of days long gone, a stroll down memory lane - and a simple question: can making a better game destroy the gaming experience? It is also about the communities that are generated by and through MMORPGs. I was a rabid PvPing attack fiend in Ultima Online, a mercurial raiding fanatic in EverQuest and a solo player in WoW. I did not really spend enough time in the other MMORPGs that I have played to establish much of an online identity. I can say with certainty that years of my life have been spent online playing these games. Slash played in EQ was just so wrong - people should not be able to see that stuff! I remember old guild buds who were significantly over 600 days played, and when I retired I was around 250 on just one of my characters.

Read More

Carmack vs Physics - First Round

Originally posted,

Talking some more on the subject of physics and games - I wanted to shoot out some things based on the Quakecon 2005 keynote by John Carmack.

But I do think it’s a mistake for people to try and go overboard and try and do a real simulation of the world because it’s a really hard problem, and you’re not going to give that much real benefit to the actual game play. You’ll tend to make a game which may be fragile, may be slow, and you’d better have done some really, really neat things with your physics to make it worth all of that pain and suffering.

Read More

Physics in Computer Games

Originally posted,

Physics has been a hot topic in the games industry for the last couple of years. I personally think that both collision and physics will be a very important development in audience immersion, bringing games closer to a virtual environment than in previous generations. The subject has definitely encouraged a very wide range of opposing views - however, it’s the question of the usefulness of physics in a game that I want to talk about for a few words. Without question, making a game include physical components breaks a firm tradition of the game industry, in which the level designer had complete control of the gaming environment at all times. Physics essentially introduces a certain amount of chaos into the system that has to be taken into account during game and level design. While it is more than possible to include physical puzzles or requirements in a game, the argument that these same events could be scripted (in some complex fashion) is correct. It is my opinion that physics is not a technology that will provide game designers with new gaming tools; rather, it is a way to help create the suspension of disbelief necessary to lull an audience into the narrative the game is trying to express. Just as textures on the graphical output, as opposed to flat shading, do not actually provide a new game mechanic, they do help increase the level of immersion during play.