[Rx] Using the ObserveOn and SubscribeOn operators

By jay at July 24, 2011 20:04

TLDR: This post talks about how the Reactive Extensions ObserveOn operator changes the execution context (the thread) of the IObserver OnNext/OnCompleted/OnError methods, whereas the SubscribeOn operator changes the execution context of the implementation of the Subscribe method in the chain of operators. Both operators can be useful to improve the performance of an application's UI by putting work on background threads.

 

When developing asynchronous code, or consuming asynchronous APIs, you find yourself forced to use specific methods on a specific thread.

The most prominent examples are WPF/Silverlight and WinForms, where you cannot use UI-bound types outside of the UI thread. In the context of WPF, you'll find yourself forced to use the Dispatcher.Invoke method to manipulate the UI in the proper context.

However, you don't want to execute everything on the UI thread, because the UI performance relies on it. Doing too much on the UI thread can lead to very bad perceived performance, and angry users...

 

Rx framework's ObserveOn operator

I've discussed a few times the use of ObserveOn in the context of WP7, where it is critical to leave the UI thread alone and avoid choppy animations, for instance.

The ObserveOn operator changes the context of execution (scheduler) of a chain of operators until another operator changes it.

To be able to demonstrate this, let's write our own scheduler :

public class MyScheduler : IScheduler
{
    // Warning: ThreadStatic is not supported on Windows Phone 7.0 and 7.1
    // This code will not work properly on this platform.
    [ThreadStatic]
    public static int? SchedulerId;

    private int _schedulerId;
    private IScheduler _source;
        
    public MyScheduler(IScheduler source, int schedulerId)
    {
        _source = source;
        _schedulerId = schedulerId;
    }

    public DateTimeOffset Now { get { return _source.Now; } }

    public IDisposable Schedule<TState>(
              TState state, 
              Func<IScheduler, TState, IDisposable> action
           )
    {
        return _source.Schedule(state, WrapAction(action));
    }

    private Func<IScheduler, TState, IDisposable> WrapAction<TState>(
              Func<IScheduler, TState, IDisposable> action)
    {
        return (scheduler, state) => {

            // Set the TLS with the proper ID
            SchedulerId = _schedulerId;

            return action(_source, state);
        };
    }
}

This scheduler's purpose is to intercept calls to the IScheduler methods (you'll fill in the missing Schedule overloads by yourself) and flag them with a custom scheduler ID. That way, we'll know which scheduler is executing our code.
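For reference, here is a minimal sketch of what the complete wrapper could look like with the two time-based overloads filled in. It assumes the IScheduler interface as it ships in the current System.Reactive package (three generic Schedule overloads plus Now); the class is named TaggedScheduler here only to avoid clashing with the listing above, and that name is not part of Rx.

```csharp
using System;
using System.Reactive.Concurrency;

// Hypothetical name; same idea as MyScheduler, with all three overloads.
public class TaggedScheduler : IScheduler
{
    // One value per thread: the ID of the scheduler that last ran work on it.
    [ThreadStatic]
    public static int? SchedulerId;

    private readonly IScheduler _source;
    private readonly int _schedulerId;

    public TaggedScheduler(IScheduler source, int schedulerId)
    {
        _source = source;
        _schedulerId = schedulerId;
    }

    public DateTimeOffset Now { get { return _source.Now; } }

    // Immediate scheduling.
    public IDisposable Schedule<TState>(
        TState state, Func<IScheduler, TState, IDisposable> action)
    {
        return _source.Schedule(state, Wrap(action));
    }

    // Relative due time (used by Timer, Delay, ...).
    public IDisposable Schedule<TState>(
        TState state, TimeSpan dueTime, Func<IScheduler, TState, IDisposable> action)
    {
        return _source.Schedule(state, dueTime, Wrap(action));
    }

    // Absolute due time.
    public IDisposable Schedule<TState>(
        TState state, DateTimeOffset dueTime, Func<IScheduler, TState, IDisposable> action)
    {
        return _source.Schedule(state, dueTime, Wrap(action));
    }

    private Func<IScheduler, TState, IDisposable> Wrap<TState>(
        Func<IScheduler, TState, IDisposable> action)
    {
        return (scheduler, state) =>
        {
            // Flag the executing thread before running the scheduled work.
            SchedulerId = _schedulerId;
            return action(_source, state);
        };
    }
}
```

As in the original listing, the wrapped action receives the inner scheduler, so work that reschedules itself recursively is no longer flagged. That is good enough for the experiments in this post.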

Note that this code will not work properly on Windows Phone 7, since ThreadStaticAttribute is not supported. And it's still not supported on 7.1... Seems like not enough people are using ThreadStatic to make its way to the WP7 CLR...

Anyway, now if we write the following Rx expression :

Observable.Timer(TimeSpan.FromSeconds(1), new MyScheduler(Scheduler.ThreadPool, 42))
          .Do(_ => Console.WriteLine(MyScheduler.SchedulerId))
          .First();

We force the timer to raise OnNext on the ThreadPool through our scheduler, and we'll get the following :

42

Which means that the lambda passed as a parameter to the Do operator got executed in the context of the Scheduler used when declaring the Timer operator.

If we go a bit further :

Observable.Timer(TimeSpan.FromSeconds(1), new MyScheduler(Scheduler.ThreadPool, 42))
          .Do(_ => Console.WriteLine("Do(1): " + MyScheduler.SchedulerId))
          .ObserveOn(new MyScheduler(Scheduler.ThreadPool, 43))
          .Do(_ => Console.WriteLine("Do(2): " + MyScheduler.SchedulerId))
          .ObserveOn(new MyScheduler(Scheduler.ThreadPool, 44))
          .Do(_ => Console.WriteLine("Do(3): " + MyScheduler.SchedulerId))
          .Do(_ => Console.WriteLine("Do(4): " + MyScheduler.SchedulerId))
          .First();

We'll get the following :

Do(1): 42
Do(2): 43
Do(3): 44
Do(4): 44

Each time a scheduler was specified, the OnNext delegates of the operators that follow were executed on that scheduler.

In this case, we're using the Do operator which does not take a scheduler as a parameter. There are some operators though, like Delay, that implicitly use a scheduler that changes the context.

Using ObserveOn is particularly useful when the OnNext delegate is performing a context-sensitive operation, like manipulating the UI, or when the source scheduler is the UI and the OnNext delegate is not related to the UI and can be executed on another thread.

You'll find that operator handy with the WebClient or GeoCoordinateWatcher classes, which both execute their handlers on the UI thread. Watch out for Windows Phone 7.1 (Mango) though, this may have changed a bit.
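To see the thread hop without a UI framework at hand, here is a small console-friendly sketch. It assumes the current System.Reactive package; an EventLoopScheduler stands in for the UI thread (a single dedicated thread), and ObserveOnSketch is a hypothetical name.

```csharp
using System;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Threading;

public static class ObserveOnSketch
{
    // Returns true when the work and the delivery ran on different threads.
    public static bool Run()
    {
        int workThread = -1;
        int deliveryThread = -1;

        // Stand-in for a UI thread: one dedicated thread processing a queue.
        using (var uiLike = new EventLoopScheduler())
        {
            int result = Observable
                .Return(21, TaskPoolScheduler.Default)
                // Runs wherever the source raises OnNext (a task pool thread).
                .Select(x =>
                {
                    workThread = Thread.CurrentThread.ManagedThreadId;
                    return x * 2;
                })
                // Everything below this line is delivered on the uiLike thread.
                .ObserveOn(uiLike)
                .Do(_ => deliveryThread = Thread.CurrentThread.ManagedThreadId)
                // Blocks the caller until the sequence completes.
                .Wait();

            return result == 42 && workThread != deliveryThread;
        }
    }
}
```

In a real application, the ObserveOn argument would be the dispatcher scheduler of the UI framework rather than an EventLoopScheduler.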

 

An Rx Expression's life cycle

Using an Rx expression is performed in at least 5 stages :

  • The construction of the expression,
  • The subscription to the expression,
  • The optional execution of the OnNext delegates passed as parameters (whether it be observers or explicit OnNext delegates),
  • The observer chain gets disposed either explicitly or implicitly,
  • The observers can optionally get collected by the GC.

The third part's execution context is covered by ObserveOn. But for the first two, this is different.

The expression is constructed like this : 

var o = Observable.Timer(TimeSpan.FromSeconds(1));

Almost nothing's been executed here, just the creation of the observables for the entire expression, much like the way IEnumerable expressions work: until you call IEnumerator.MoveNext, nothing is performed. In Rx expressions, until the Subscribe method is called, nothing happens.

Then you can subscribe to the expression :

var d = o.Subscribe(_ => Console.WriteLine(_));

At this point, every operator in the chain gets its Subscribe method called, meaning they can start sending OnNext/OnError/OnCompleted messages.
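The laziness is easy to observe with a hand-written source. The following sketch (assuming the current System.Reactive package, with LifecycleSketch as a hypothetical name) records what happens in a list: the subscribe delegate passed to Observable.Create only runs once Subscribe is called, not when the expression is built.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Disposables;
using System.Reactive.Linq;

public static class LifecycleSketch
{
    public static List<string> Run()
    {
        var log = new List<string>();

        // Stage 1: construction. The delegate below is stored, not executed.
        var source = Observable.Create<int>(observer =>
        {
            log.Add("subscribe");
            observer.OnNext(1);
            observer.OnCompleted();
            return Disposable.Create(() => log.Add("dispose"));
        });
        log.Add("constructed");

        // Stage 2 and 3: subscription, then the OnNext/OnCompleted delegates.
        var subscription = source.Subscribe(
            value => log.Add("next " + value),
            () => log.Add("completed"));

        // Stage 4: disposal (a no-op if completion already tore things down).
        subscription.Dispose();

        return log;
    }
}
```

The log starts with "constructed", followed by "subscribe", "next 1" and "completed", which shows that building the expression did not run any of the source's code.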

 

The case of Observable.Return and SubscribeOn

Then you meet this kind of expression :

Observable
   .Return(42L)
   // Merge both observables into one, whichever the order of appearance
   .Merge(
      Observable.Timer(
         TimeSpan.FromSeconds(1), 
         new MyScheduler(Scheduler.ThreadPool, 42)
      )
   )
   .Subscribe(_ => Console.WriteLine("Do(1): " + MyScheduler.SchedulerId));

Console.WriteLine("Subscribed !");

This expression will merge the two observables into one that will provide two values, one from Return and one from the timer.

And this is the output :

Do(1):
Subscribed !
Do(1): 42

The Observable.Return OnNext was executed during the call to Subscribe, and that thread has no SchedulerId, meaning that a whole lot of code has been executed in the context of the caller of Subscribe. You can imagine that if that expression is complex, and that the caller is the UI thread, that can become a performance issue.

This is where the SubscribeOn operator becomes handy :

Observable
   .Return(42L)
   // Merge both observables into one, whichever the order of appearance
   .Merge(
      Observable.Timer(
         TimeSpan.FromSeconds(1), 
         new MyScheduler(Scheduler.ThreadPool, 42)
      )
   )
   .SubscribeOn(new MyScheduler(Scheduler.ThreadPool, 43))
   .Subscribe(_ => Console.WriteLine("Do(1): " + MyScheduler.SchedulerId));

Console.WriteLine("Subscribed !");

You then get this :

Subscribed !
Do(1): 43
Do(1): 42

The first OnNext is now executed under a different scheduler, making Subscribe a whole lot faster from the caller's point of view.

 

Why not always Subscribe on another thread ?

That might come in handy, but you may not want that as the default behavior because of this scenario :

Observable.FromEventPattern(textBox, "TextChanged")
          .SubscribeOn(new MyScheduler(Scheduler.ThreadPool, 43))
          .Subscribe(_ => { });

Console.WriteLine("Subscribed !");

You'd get an "Invalid cross-thread access." System.UnauthorizedAccessException, because you would try to add an event handler to a UI element from a different thread.

Interestingly though, this code does not work on WP7 but does on WPF 4.

Another scenario is one where delaying the subscription may lose messages, so you need to make sure you're completely subscribed before raising events.
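Here is a small sketch of that lost-message scenario, made deterministic for the sake of the demonstration. It assumes the current System.Reactive package and uses a HistoricalScheduler (a virtual-time scheduler that only runs queued work when told to) in place of a real background scheduler; LateSubscriptionSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class LateSubscriptionSketch
{
    public static List<int> Run()
    {
        var received = new List<int>();
        var subject = new Subject<int>();

        // Virtual-time scheduler: queued work runs only when Start is called.
        var scheduler = new HistoricalScheduler();

        using (subject
            .SubscribeOn(scheduler)
            .Subscribe(value => received.Add(value)))
        {
            // The actual subscription is still sitting in the scheduler queue,
            // so nobody is listening to the subject yet: this value is lost.
            subject.OnNext(1);

            // Run the queued work, which performs the real Subscribe call.
            scheduler.Start();

            // Now the observer is attached and receives the value.
            subject.OnNext(2);
        }

        return received;
    }
}
```

With a real ThreadPool scheduler the outcome depends on timing, which is exactly what makes this kind of bug hard to reproduce.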

 

So there you have it :) I hope this helps you understand a bit better those two operators.

Team Build and Windows Phone 7

By jay at May 01, 2011 00:00

Building Windows Phone 7 applications in an agile way encourages the use of Continuous Integration, and that can be done using Team System 2010.

There are a few pitfalls to avoid to get there, but this can be achieved quite easily with great results.

I won't cover the goodness of automated builds, this has already been covered a lot.

 

Adding Unit Tests

Along with the continuous integration to create hopefully successful builds out of every check-in of source code, you'll also find the automated execution of unit tests. Team System has the ability to provide nice code coverage and unit tests success rates in Reporting Services reports, where statistics can be viewed, which gives good health indicators of the project.

Unfortunately for us, at this point there is no way to automatically execute tests using the WP7 .NET runtime. But if you successfully use an MVVM approach, your view models and non-UI code can be compiled for multiple platforms, because they do not rely on the UI components that may be specific to the WP7 platform. That way, we are still able to test our code using the .NET 4.0 runtime with MSTest and Visual Studio 2010 test projects.

To avoid repeating code, multi-targeted files can be integrated into single target projects either by :

  • Using the Project Linker tool to link multiple projects,
  • Creating project files in the same folder and using the "include file" command when showing all files in the solution explorer. Make sure to change the output assembly name to something like "MyAssembly.Phone.dll" to avoid conflicts.

Multi-targeted files use the #if directive and the WINDOWS_PHONE define, or the lack thereof, to compile code for the current runtime target.
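A multi-targeted file could look like the following sketch. The PlatformInfo class and its strings are made up for the illustration; only the WINDOWS_PHONE symbol comes from the phone project's build configuration.

```csharp
using System;

// Hypothetical helper shared by the phone project and the desktop test project.
public static class PlatformInfo
{
    public static string Name
    {
        get
        {
#if WINDOWS_PHONE
            // Compiled only when the project defines WINDOWS_PHONE.
            return "Windows Phone";
#else
            // Compiled for every other target, such as the .NET 4.0 test project.
            return "Desktop .NET";
#endif
        }
    }
}
```

The same source file is then included in both projects, and each compiler pass only sees the branch that matches its defines.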

There is also the option of creating projects with the Portable Library add-in, but there are some caveats on that side, and there are a few constraints when using this method. You may need to externalize code that is not supported, like UrlDecode.

Testing with MSTest ensures that your code runs successfully on the .NET 4.0 runtime, but this does not test on the WP7 runtime. So to be sure that your code is successfully running on it, and since Windows Phone 7 is built on Silverlight 3, tools like the SL3 Unit Test framework can be used to manually run tests in the emulator. This cannot be integrated into the build for now, unfortunately; you'll have to place that in your QA tests.

 

TeamBuild with the WP7 toolkit

To be able to build WP7 applications on your build machine, you need to install the SDK on it, along with a Team Build agent and/or controller.

Creating a build definition is done the same way as for any other build definition, except for one detail. You need to set the MSBuild platform to x86 instead of Auto if your build machine is running 64-bit Windows. This forces MSBuild to use the 32-bit runtime, and not the 64-bit one, where the SDK does not work properly.

If you don't, when building your WP7 application, you'll find this intriguing message :

Could not load file or assembly 'System.Windows, Version=2.0.5.0'

Which is particularly odd considering that you've already installed the SDK, and that dll is definitely available.

You may also find that if you install that DLL from the SDK in the GAC, you'll get that other nice message :

Common Language Runtime detected an invalid program.

Which is most commonly found when mixing 32-bit and 64-bit assemblies, for which the architecture has been explicitly specified instead of "Any CPU". So don't install that DLL in the GAC and set the MSBuild architecture to x86.

 

That's it for now, and Happy WP7 building !

[WP7] HttpWebRequest and the Flickr app "Black Screen" issue

By jay at April 22, 2011 14:54

TL;DR: While trying to fix the "Black Screen" issue of the Windows Phone 7 flickr app 1.3, I found out that HttpWebRequest is internally making a synchronous call to the UI Thread, making a network call negatively impact the UI. The entire building of an asynchronous web query is performed on the UI thread, and you can't do anything about it.

Edit: This post was formerly named "About the UI Thread performance and HttpWebRequest", but was in fact about Yahoo's Flickr application and was enhanced accordingly.

When programming on Windows Phone 7, you'll hear often that to improve the perceived performance, you'll need to get off of the UI Thread (i.e. the dispatcher) to perform non UI related operations. By good perceived performance, I mean having the UI respond immediately, not stall when some background processing is done.

To achieve this, you'll need to use the common asynchrony techniques like queueing work on the ThreadPool, creating a new thread, or using the Begin/End pattern.

All of this is very true, and one very good example of bad UI Thread use is the processing of the body of a web request, particularly when using the WebClient where the raised events are in the context of the dispatcher. From a beginner's perspective, not having to care about changing contexts when developing a simple app that updates the UI, provides a particularly good and simple experience.

But that has the annoying effect of degrading the perceived performance of the application, because many parts of the application tend to run on the UI thread.

 

HttpWebRequest to the rescue ?

You'll find that the HttpWebRequest is a better choice in that regard. It uses the Begin/End pattern and the execution of the AsyncCallback is performed in the context of ThreadPool. This performs the execution of the code in that callback in a way that does not impact the perceived performance of the application.

Using the Reactive Extensions, this can be written like this :

var request = WebRequest.Create("http://www.google.com");

// The result type must be given explicitly, it cannot be inferred from the lambdas.
var queryBuilder = Observable.FromAsyncPattern<WebResponse>(
                                (h, o) => request.BeginGetResponse(h, o),
                                ar => request.EndGetResponse(ar));

queryBuilder()
                /* Perform the expensive work in the context of the AsyncCallback */
                /* from the WebRequest. This will be the ThreadPool. */
                .Select(response => DoSomeExpensiveWork(response))

                /* Go back to the UI Thread to execute the OnNext method on the subscriber */
                .ObserveOnDispatcher()
                .Subscribe(result => DisplayResult(result));

That way, you'll get most of your code to execute out of the UI thread, where that does not impact the perceived performance of the application.

 

Why would it not be to the rescue then ?

Actually, it will always be (as of Windows Phone NoDo), but there's a catch. And that's a big deal, from a performance perspective.

Consider this code :

 public App()
 {
     /* some application initialization code */

     ManualResetEvent ev = new ManualResetEvent(false);

     ThreadPool.QueueUserWorkItem(
         d =>
         {
             var r = WebRequest.Create("http://www.google.com");
             r.BeginGetResponse((r2) => { }, null);

             ev.Set();
         }
     );

     ev.WaitOne();
 }

This code is basically beginning a request on the thread pool, while blocking the UI thread in the App.xaml.cs file. This makes the construction (but not the actual call on the network) of the WebRequest synchronous, and makes the application wait for the request to begin before showing any page to the user.

While this code is definitely not a best practice, there was a code path in the Flickr 1.3 application that was doing something remotely similar, in a more convoluted way. And if you try it for yourself, you'll find that the application hangs in a deadlock during the startup of the application, meaning that our event is never set.

 

What's happening ?

If you dig a bit, you'll find that the stack trace for a thread in the thread pool is the following :

  mscorlib.dll!System.PInvoke.PAL.Threading_Event_Wait() 
  mscorlib.dll!System.Threading.EventWaitHandle.WaitOne() 
  System.Windows.dll!System.Windows.Threading.Dispatcher.FastInvoke(...) 
  System.Windows.dll!System.Net.Browser.AsyncHelper.BeginOnUI(...)
  System.Windows.dll!System.Net.Browser.ClientHttpWebRequest.BeginGetResponse(...) 
  WindowsPhoneApplication2.dll!WindowsPhoneApplication2.App..ctor.AnonymousMethod__0(...)

The BeginGetResponse method is trying to execute something on the UI thread. And in our example, since the UI thread is blocked by the manual reset event, the application hangs in a deadlock between a resource in the dispatcher and our manual reset event.

This is also the case for the EndGetResponse method.

But if you dig even deeper, you'll find in the version of the System.Windows.dll assembly in the WP7 emulator (the one in the SDK is a stub for all public types), that the BeginGetResponse method is doing all the work of actually building the web query on the UI thread !

That is particularly disturbing. I'm still wondering why that network-only code would need to be executed on the UI thread.

 

What's the impact then ?

The impact is fairly simple : The more web requests you make, the less responsive your UI will be, both for processing the beginning and the end of a web request. Each call to the methods BeginGetResponse and EndGetResponse implicitly goes to the UI thread.

Remote control applications like mine, which try to provide remote mouse control, are all affected by the same lagging mouse behavior. That's partially because the UI thread is particularly busy processing Manipulation events, which explains a lot about the performance issues of web requests performed at the same time, even when using HttpWebRequest instead of WebClient. This also explains why the web requests are strongly slowed down until the user stops touching the screen.

 

The Flickr 1.3 "Black Screen" issue

In the Flickr application, which I've been able to work on, a lot of people were reporting a "black screen" issue, where the application stopped working after a few days.

The application was actually trying to update a resource from the application startup in an asynchronous fashion using the HttpWebRequest. Because of a race condition with another lock in the application and the UI thread that was waiting in the app's initialization, this resulted in an infinite "Black Screen" that could only be bypassed by reinstalling the application.

Interestingly enough, at this point in the application's initialization, in the App's class constructor, the application is not killed after 10 seconds if it is not showing a page to the user. However, if the application stalls in the constructor of the first page, the application is automatically killed by the OS after something like 10 seconds.

Regarding the use of the UI thread inside the HttpWebRequest code: for applications that are network intensive and fetch a lot of small web resources like images, this has a negative impact on performance. The UI thread is constantly interrupted to process network resource queries and responses.

 

Can I do something about it ?

During the analysis of the emulator version of the System.Windows.dll assembly, I noticed that BeginGetResponse checks whether the current context is the UI thread, and in that case does not push the execution onto the dispatcher.

This means that if you can group the calls to BeginGetResponse on the UI thread, you'll spend less time switching between contexts. That's not a panacea, but at the very least you can gain on this side.

 

What about future versions of Windows Phone ?

On the good news side, Scott Gu announced at MIX11 that the manipulation events will be moved off the UI thread, making the UI "buttery smooth", to use his words. A lot of applications will benefit from this change.

Anyway, let's wait for Mango. I'm guessing that this will change in a very positive way, and allow us to have high performance apps on the Windows Phone platform.

[Reactive] Being fluent with CompositeDisposable and DisposeWith

By jay at April 04, 2011 13:48

When you're writing a few queries with the reactive extensions, you'll probably end up doing a lot of this kind of code :

moveSubscription = Observable.FromEvent<MouseEventArgs>(this, "MouseMove")
                             .Subscribe(_ => { });

clickSubscription = Observable.FromEvent<MouseEventArgs>(this, "MouseClick")
                              .Subscribe(_ => { }); 

You'll probably call the dispose method on both subscriptions in some dispose method in the class that creates the subscriptions. But this is a bit too manual for my taste.

Rx provides the CompositeDisposable class, which is basically a list of IDisposable instances that all get disposed when CompositeDisposable.Dispose() is called. So we can write it like this :

var cd = new CompositeDisposable();

var moveSubscription = Observable.FromEvent<MouseEventArgs>(this, "MouseMove")
                                 .Subscribe(_ => { });
cd.Add(moveSubscription);

var clickSubscription = Observable.FromEvent<MouseEventArgs>(this, "MouseClick")
                                  .Subscribe(_ => { });
cd.Add(clickSubscription);

That's better in the sense that I no longer need to keep track of each subscription separately, but this is still too "manual". There are two lines for each subscription, and the need for temporary variables.

That's where a simple DisposeWith extension comes in handy :

public static class DisposableExtensions
{
    // Registers the disposable so it gets disposed along with the composite.
    public static void DisposeWith(this IDisposable disposable, CompositeDisposable disposables)
    {
        disposables.Add(disposable);
    }
}

A very simple extension, a bit more fluent, that avoids creating a temporary variable just to dispose of a subscription:

var cd = new CompositeDisposable();

Observable.FromEvent<MouseEventArgs>(this, "MouseMove")
          .Subscribe(_ => { })
          .DisposeWith(cd);

Observable.FromEvent<MouseEventArgs>(this, "MouseClick")
          .Subscribe(_ => { })
          .DisposeWith(cd);

This is a method with a side effect, but that's acceptable in this case, and it always ends the declaration of an observable query.
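If you'd rather keep the subscription at hand after registering it, a possible variant is to return the disposable instead of void. This is a sketch assuming the current System.Reactive package; the generic signature is my own variation, not something Rx provides, and DisposeWithDemo is a hypothetical name used to show the effect without any UI events.

```csharp
using System;
using System.Reactive.Disposables;

public static class FluentDisposableExtensions
{
    // Variant of DisposeWith that hands the disposable back to the caller.
    public static T DisposeWith<T>(this T disposable, CompositeDisposable disposables)
        where T : IDisposable
    {
        disposables.Add(disposable);
        return disposable;
    }
}

public static class DisposeWithDemo
{
    // Returns how many registered disposables were disposed by the composite.
    public static int Run()
    {
        var disposed = 0;
        var cd = new CompositeDisposable();

        // Stand-ins for subscriptions: each one counts its own disposal.
        Disposable.Create(() => disposed++).DisposeWith(cd);
        Disposable.Create(() => disposed++).DisposeWith(cd);

        // Disposing the composite disposes everything registered with it.
        cd.Dispose();

        return disposed;
    }
}
```

Only one of the two DisposeWith flavors should be in scope at a time, since both apply to an IDisposable receiver.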

[WP7Dev] Double tap when you expect only one

By jay at March 27, 2011 19:24

I've been developing a free application to do some PC remote control on Windows Phone 7, and it's been very instructive in many ways.

To improve the quality of the software, and be notified when an unhandled exception occurs somewhere in my code, or in someone else's code executed on my behalf, I've added a small opt-in unhandled exception reporting feature. This basically sends me back information about the device, most of what's available in DeviceExtendedProperties for device aggregation of exceptions, plus some information like the culture and, of course, the exception stacktrace and details.

 

The MarketplaceDetailTask exception

A few recurring exceptions have popped up a lot recently, and one coming often is the following :

Exception : System.InvalidOperationException: Navigation is not allowed when the task is not in the foreground. Error: -2147220989 
at Microsoft.Phone.Shell.Interop.ShellPageManagerNativeMethods.CheckHResult(Int32 hr) 
at Microsoft.Phone.Shell.Interop.ShellPageManager.NavigateToExternalPage(String pageUri, Byte[] args) 
at Microsoft.Phone.Tasks.ChooserHelper.Navigate(Uri appUri, ParameterPropertyBag ppb) 
at Microsoft.Phone.Tasks.MarketplaceLauncher.Show(MarketplaceContent content, MarketplaceOperation operation, String context) 
at Microsoft.Phone.Tasks.MarketplaceDetailTask.Show() 

This code is called when a user clicks on the purchase image located on some page of the software, and it looks like this :


void PurchaseImage_ManipulationCompleted(object sender, ManipulationCompletedEventArgs e)
{
    var details = new MarketplaceDetailTask();
    details.ContentIdentifier = "d0736804-b0f6-df11-9264-00237de2db9e";
    details.Show();
}

The call is performed directly on the image's ManipulationCompleted event.

I've been trying to reproduce it a few times and I finally got it: The user is tapping more than once on the image.

I can see a few reasons why:

  • The ManipulationCompleted event is fairly sensitive and is raised multiple times even though the user did not tap twice
  • The user did actually tap twice because the action did not answer fast enough.
  • The user tapped twice because he is used to always tap twice, as some PC users do... (You know, double clicking on hyperlinks in browsers, things like that)

 

What do we do about it ?

There may actually be more, but this already tells me a lot.

First, I should have some kind of visual feedback on the click of that image, to tell the user that he has done something (and also to actually follow the design guidelines).

Second, even if the feedback is there, there may always be two subsequent clicks, the second of which may be executed after the first has called MarketplaceDetailTask.Show() and the application has been deactivated. I cannot do much about it, except handle the exception silently or track the actual application state and not call the method.

I'll go with the exception handler for now as it is not a very critical piece of code, but I'd rather have some way to tell that the application cannot do that reliably, and not have to handle an exception. The API is rather limited on that side: the PhoneApplicationService only raises events and does not expose the current "activation" state.
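Both options can be combined in a tiny helper. The following is only a sketch: SingleLaunchGuard is a hypothetical class, the launch delegate would wrap the MarketplaceDetailTask.Show() call, and I'm assuming Reset would be called when the application comes back to the foreground (for instance from the PhoneApplicationService Activated event).

```csharp
using System;
using System.Threading;

// Hypothetical helper: lets the first tap through and ignores the following ones.
public sealed class SingleLaunchGuard
{
    private int _launched; // 0 = nothing launched yet, 1 = a launch is in flight

    // Returns true when the launch delegate ran without throwing.
    public bool TryLaunch(Action launch)
    {
        // Atomically claim the launch; a second tap sees 1 and is ignored.
        if (Interlocked.Exchange(ref _launched, 1) == 1)
        {
            return false;
        }

        try
        {
            launch();
            return true;
        }
        catch (InvalidOperationException)
        {
            // Navigation was refused (the task is not in the foreground): swallow it.
            return false;
        }
    }

    // Allow a new launch, for example once the application is reactivated.
    public void Reset()
    {
        Interlocked.Exchange(ref _launched, 0);
    }
}
```

In the ManipulationCompleted handler, the body would then become something like guard.TryLaunch(() => details.Show()), with guard being a field of the page.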

 

Any other examples of exceptions ?

I'll talk more about some other findings this opt-in exception reporting feature has brought me, with some that seem to be pretty tricky. 

A bit of IT in developer's world: services.exe high CPU usage

By jay at March 24, 2011 12:36

One of the advantages of virtualization is the P2V (Physical to Virtual) process: Converting an "old" build machine to a VM so it can be moved around with the load as-is, snapshotted, backed-up and so on.

This is particularly useful when, say, you have a build machine that's been there for a very long time, has a lot of dependencies on old third-party software, and has been customized by so many people (who have long left the company) that if you wanted to rebuild that machine from scratch, it would literally take you weeks of tweaking to get it to work properly. And that machine is running on very old hardware that may break at any time. And the edition of Windows does not migrate easily to new hardware because of HAL or Mass Storage issues, requiring a reinstallation. That's a lot of "ands".

That's the kind of choice you do not need to make: You just take the machine and virtualize it using SCVMM 2008 R2.

But still, even virtualized, the machine has been there that long, and things have started falling apart, like having the services.exe process taking 100% of the CPU. And I did not want to have to rebuild that machine just because of that strange behavior.

If you read Scott Hanselman's blog, you've been reminded that Windows Server 2008 and later have the Resource Monitor, which gives a wealth of information about the services running under services.exe. But if you're out of luck, like running under Windows Server 2003, you can still use Process Explorer. This will give you a similar kind of insight into the Windows services that are running.

For my particular issue, this was actually the Event Log service that was taking all the resources.

 

How about I get my CPU back ?

After some digging around, I noticed that :

  • All Event Viewer logging sections in the MMC snap-in were displaying the same thing, which was actually a mix of all the System, Application and Security logs.
  • Displaying any of these logs was taking a huge amount of time.
  • The HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Eventlog\System\Source value contained something like "System System System System System System System" a hundred times, and the same was true of the values for the other event logs.
  • There were a whole bunch (thousands) of interesting sources named like some .NET application domain created by the application being built on this machine.

To fix it, a few steps in that order :

  • Disable the Event Log service and reboot. You won't be able to stop it, but at the next reboot it will not start.
  • In the C:\WINDOWS\system32\config folder, move the *.evt files to a temporary folder, so they don't get picked up by the service when it restarts
  • In the registry, for each HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Eventlog\[System|Security|Application]\Source, replace the content with the one found on the same key on another very similar Windows Server 2003 machine. You can install a brand new machine and pick up the content.
  • If you have, like I did, a whole bunch of sources that look familiar and should not be there under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Eventlog\[System|Security|Application], remove their keys if you removed them from the "Source" value.
  • Set the Event Log service to "Automatic" and reboot.

 

The interesting part about the virtualization of that build machine is like in many other occasions, the snapshots, where you can make destructive changes and go back if they were actually too destructive.

 

What's with the event log "interesting sources" ?

The application being built and tested is running tests on the build machine, and it makes use of application domains and log4net. Log4net has an EventLogAppender that allows the push of specific content to the Windows Event Log. Log4net defaults the name of the source to the application domain name, if there is no entry assembly.

Those tests were actually using a default configuration, and were logging Critical messages to the event log, but the domains were created using a new GUID to avoid supposed name collisions. This is something that did actually more harm than good in the long run, because each new appdomain that was logging to the event log was creating a new event source.

And the build system has been there for a long time. Hence the thousands of "oddly named" event sources.

 

Virtual Machines, Snapshots, Automated Tests and Machine Trust Account

By jay at December 14, 2010 21:47

During the development of a project, you may need at some point to automate the testing of your whole application. Virtual Machines make that very easy, especially with Visual Studio Lab Management in the loop.

You can have a step in your nightly build to have your application installed silently, and then have some acceptance tests running on it.

To make all this repeatable and mostly predictable, you can use snapshots of a clean environment and restore that environment before starting your test scenarios. That way, any previous test run does not affect the new run, as long as the tests can run confined in a single machine. For multiple-machine tests, like a web front end and DB backend, you may need to synchronize all of the snapshots, but this is definitely doable.

The case of the VM part of a Domain

Your tests may also need to have a domain account to run. To do this, you join your VM to the domain, then make a snapshot.

Your automated tests run fine for about a month, then start to fail for reasons like :

The trust relationship between this workstation and the primary domain failed.
 
Or something a bit different like :
 
This computer could not authenticate with \\MYDC, a Windows domain controller for domain MYDOMAIN, and therefore this computer might deny logon requests.
 
Which means that the machine account registered with the domain is out of sync.
 
For a bit of background, the machine account is used by Windows subsystems, and also by the "NT AUTHORITY\System" account, to communicate with the domain, for instance to apply GPOs. This account is also used for many other things, like in SCVMM, to ensure that the server has access to a host without requiring a "real" account that has administrative access to that host.
 

Password Renewal

 

That machine trust account has a password like a normal user account, even though it is not accessible to the users. That password is changed periodically, every 30 days or so, and that change is initiated by a request of the machine tied to this account. It is designed in such a way that it allows machines to be offline for long periods and not get out of sync because the DC would have changed the password unilaterally.
 
At this point, you're probably seeing why that account information can be out of sync when using VMs and reverting to snapshots.
 
When joining the domain, the account is in sync, and it is possible to revert to that snapshot until the 30-day window elapses. Then, the running machine asks for a password renewal of the trust account, and afterwards you revert to your original snapshot. This is where the problem occurs: the reverted machine cannot authenticate on the domain anymore, because it is still using the previous password.

 

Keeping the password in sync

 

There are a few techniques to keep that password in sync, the first being a leave and join domain sequence. This is fairly easy to do, but there are some caveats.
 
When you leave and re-join, you get a new SID for that machine, which means that if you gave access rights to the previous machine account on another machine, you're forced to update your ACLs on that other machine.
 
But there is a lesser known technique based on the netdom command, to reset that trust account password : 
 
netdom resetpwd /server:MYDC /userd:MYDOMAIN\myuser /passwordD:* /securepasswordprompt

This needs to be run on the target machine, using an account that has the rights to update machine accounts, which is most probably the same account you used to add your machine to the domain.

Making it permanent

 

You still don't want to have to reset that password every 30 days or so, so you can use the DisablePasswordChange registry setting (a Netlogon parameter, under HKLM\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters), which disables the update of the machine account password.
This will make your VM a bit more vulnerable to password-based attacks that try to hijack your machine account, just so you know.

[WP7Dev][Reactive] Safer Reactive Extensions

By jay at September 06, 2010 20:26 Tags: , , , ,

This article is also available in French.

When developing .NET applications, unhandled exceptions in threads have the undesirable effect of terminating the current process.

In the following example :

    static void Main(string[] args)
    {
        var t = new Thread(ThreadMethod);
        t.Start();

        Console.ReadLine();
    }

    private static void ThreadMethod()
    {
        Thread.Sleep(1000); throw new Exception();
    }

The basic exception will invariably terminate the process, and to prevent this, the exception needs to be handled properly :

    private static void ThreadMethod()
    {
        try
        {
            Thread.Sleep(1000); throw new Exception();
        }
        catch (Exception e)
        {
            // TODO: Log and report the exception
        }
    }

This makes classes like System.Threading.Thread, System.Threading.Timer or System.Threading.ThreadPool very dangerous to use if one wants to have an always-running application. No unhandled exception may be allowed to escape the custom handlers passed to these classes.

Even if it is possible to be notified when an exception has been raised and not handled properly, using the AppDomain.UnhandledException event, most of the time this still leads to the application being terminated. This termination behavior was introduced in .NET 2.0, to prevent unhandled exceptions from being silently ignored.
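
As an illustration, here is a minimal sketch of subscribing to that event. The CrashReporting class and its Install method are hypothetical names chosen for this example; only AppDomain.CurrentDomain.UnhandledException and the event arguments come from the framework. The handler is only notified: when IsTerminating is true, the process still exits once the handler returns.

```csharp
using System;

// Hypothetical helper (not part of the .NET framework): forwards unhandled
// exceptions of the current application domain to a reporting callback.
public static class CrashReporting
{
    public static void Install(Action<Exception, bool> report)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, args) =>
        {
            // ExceptionObject is typed as object, so the cast may yield null
            // for non-Exception objects thrown by other languages.
            report(args.ExceptionObject as Exception, args.IsTerminating);
        };
    }
}
```

Calling CrashReporting.Install once at startup is enough. It gives a last chance to log the failure, but it does not keep the application alive.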

While this is a very appropriate default behavior, in an enterprise environment, I’m usually enforcing custom static analysis or NDepend rules to prevent the direct use of these classes. This forces new code to use wrappers that provide a very wide exception handler that logs and reports the exception, but does not terminate the process. That also implies that there is still a very valid bug to be investigated, because exceptions should not be handled that late.
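
Such a wrapper could look like the following sketch. SafeThread and its report callback are hypothetical names used only for this illustration, and a real wrapper would plug in the application's own logging and reporting.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper (not part of the .NET framework): runs the thread body
// inside a catch-all handler so that a failure is reported instead of
// terminating the process.
public static class SafeThread
{
    public static Thread Start(Action body, Action<Exception> report)
    {
        var thread = new Thread(() =>
        {
            try
            {
                body();
            }
            catch (Exception e)
            {
                // Last-resort handler: log and report, but keep the process alive.
                report(e);
            }
        });

        thread.Start();
        return thread;
    }
}
```

With this in place, the earlier example becomes SafeThread.Start(ThreadMethod, e => Console.WriteLine(e)), and the exception no longer takes the process down.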

 

The case of the Reactive Framework

In Silverlight for Windows Phone 7, and in any other .NET 3.5 or .NET 4.0 application that uses the Reactive Extensions, it is very easy to switch between threads.

Reactive operators like Timer, BufferWithTime, ObserveOn or SubscribeOn allow for specific Schedulers like ThreadPool, TaskPool or NewThread to be used, and if a subscriber does not handle exceptions properly, the application ends up being terminated.

The same example, written with Rx, also terminates the application :


    static void Main(string[] args)
    {
        Observable.Timer(TimeSpan.FromSeconds(10), TimeSpan.FromSeconds(10))
                  .Subscribe(_ => ThreadMethod());

        Console.ReadLine();
    }

    private static void ThreadMethod()
    {
        throw new Exception();
    }

The Observable.Timer operator uses the System.Threading.Timer class and that makes it vulnerable to the same termination problems. Every subscriber needs to handle exceptions thrown in the OnNext delegate, or the application will terminate.

Also, do not think that the OnError delegate passed to Observable.Subscribe will handle exceptions thrown during the execution of the OnNext code. OnError only notifies of errors generated by the previous Reactive operators in the chain, not of errors thrown by the current subscriber.
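
To make that last point concrete, here is a simplified model that uses only the BCL IObserver<T> interface. DelegateObserver and OnErrorDemo are hypothetical names, and this is an assumption about the general shape of a delegate-based observer, not the actual Rx implementation. It shows that an exception thrown by the OnNext handler travels back to whoever called OnNext (in the Timer case, the timer callback) and never reaches the OnError delegate.

```csharp
using System;

// Simplified stand-in for an observer built from two delegates; an assumption
// for illustration, not the actual Rx implementation.
public sealed class DelegateObserver<T> : IObserver<T>
{
    private readonly Action<T> _onNext;
    private readonly Action<Exception> _onError;

    public DelegateObserver(Action<T> onNext, Action<Exception> onError)
    {
        _onNext = onNext;
        _onError = onError;
    }

    // No try/catch here: an exception thrown by the handler propagates to
    // the caller of OnNext, that is, to the producer's thread.
    public void OnNext(T value) { _onNext(value); }

    // Only invoked when the upstream source itself reports a failure.
    public void OnError(Exception error) { _onError(error); }

    public void OnCompleted() { }
}

public static class OnErrorDemo
{
    // Returns true when the handler exception escaped OnNext without
    // the OnError delegate being invoked.
    public static bool HandlerExceptionBypassesOnError()
    {
        var onErrorCalled = false;
        var observer = new DelegateObserver<long>(
            _ => { throw new InvalidOperationException("thrown in OnNext"); },
            e => onErrorCalled = true);

        try
        {
            observer.OnNext(0); // plays the role of the timer callback
        }
        catch (InvalidOperationException)
        {
            return !onErrorCalled;
        }

        return false;
    }
}
```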

 

The IScheduler.AsSafe() extension method

Unfortunately, it is not possible for now to override the default schedulers used internally by the Reactive operators. The only way to handle all unhandled exceptions properly is to use the ObserveOn operator and intercept calls to IScheduler.Schedule methods. Calls can then be decorated with appropriate exception handlers to log and report the exception without terminating the process.

So, to be able to generalize this logging and reporting behavior, I created the AsSafe() extension that I place at the very top of a Reactive expression :

    Observable.Timer(TimeSpan.FromSeconds(10), TimeSpan.FromSeconds(10))
              .ObserveOn(Scheduler.ThreadPool.AsSafe())
              .Subscribe(_=> ThreadMethod());


And here is the code of this very simple extension method :


public static class SafeSchedulerExtensions
{
    public static IScheduler AsSafe(this IScheduler scheduler)
    {
        return new SafeScheduler(scheduler);
    }

    private class SafeScheduler : IScheduler
    {
        private IScheduler _source;

        public SafeScheduler(IScheduler scheduler) {
            this._source = scheduler;
        }

        public DateTimeOffset Now { get { return _source.Now; } }

        public IDisposable Schedule(Action action, TimeSpan dueTime)
        {
            return _source.Schedule(Wrap(action), dueTime);
        }

        public IDisposable Schedule(Action action)
        {
            return _source.Schedule(Wrap(action));
        }

        private Action Wrap(Action action)
        {
            return () => {
                try  {
                    action();
                }
                catch (Exception e) {
                    // Log and report the exception.
                }
            };

        }
    }
}

[WP7] Using an Exchange Account With a Custom Certificate

By jay at July 31, 2010 15:07 Tags: ,

Depending on which corporation you work for, you may have to connect to your Exchange server over HTTPS (using either TLS or SSL) while it uses a self-signed server certificate.

If you're unlucky enough to be in this situation, but are using a modern browser, you can install the certificate in either your Windows certificate store, or your browser's own store. You can do that using this lengthy technique for IE8.

But if you're on a Windows Phone 7 and try to connect to your Exchange account, you'll get a nice message telling you that there is a problem with the server certificate. Well, neither Internet Explorer nor the bundled Exchange tools give you the ability to install that custom certificate. And there is no access to the file system either.

Luckily, you can email your certificate to your Gmail account for instance, and the WP7 mail client has the ability to install certificates !

So, use the lengthy technique to export your certificate in the ".cer" format by connecting to your Exchange server using its HTTPS address in Internet Explorer on your PC, email it to yourself, and tap on it on your Windows Phone 7 to install it.

Now you can enjoy having your work emails and calendars on your weekends, in case you don't have anything better to do :)

 

EDIT: If it still does not work, you may need to also import the full chain of certificates, up to the root. To do so, in the certificate from your exchange server, open the "Certification Path" tab, then for each item in the tree, click "View Certificate", then "Details", then "Copy to File...". Email each certificate to your Windows Phone and you're done !

Using the Remote Debugger

By jay at July 22, 2010 20:05 Tags: , ,

This article is also available in French.

To continue in the same kind of articles about Visual Studio features that have been available for a while now, but are commonly under-used, I'll talk in this post about the Remote Debugger.

 

Local Debugging

Visual Studio has a debugger that allows debugging a program when running it using F5, or "Debug / Start Debugging". Visual Studio then starts in a special mode that allows step-by-step execution of the program, and gives access to features like breakpoints, tracepoints, watches, IntelliTrace, minidump creation and many more.

The debugger runs the program on the local machine, and uses the permissions of the locally logged on user.

Nothing out of the ordinary. Well, maybe the Reverse Debugging with IntelliTrace in VS2010, which is very cool.

 

Hardware Specific and CrapWare

I don't know about you, but I keep my development PC as stable as possible. I rarely install new software, so that I keep the overall performance stable over time. I will most of the time install new software versions only after having tested them on other PCs to determine their behavior.

Call me a maniac, that's what it is :)

But then, what to do when the need for testing an installation program comes up ? Or when you need to debug plugins for NI TestStand or LabVIEW ? Or when the software needs a very specific kind of hardware that cannot be installed on your development PC ? (Rainbow Keys, anyone ?)


The answer is simple : The Remote Debugger ! When possible, I will test and debug my software on a virtual machine, or on a physical machine that has the appropriate environment to execute the software.

That way, the development environment stays stable, and I don't need to install software that could add some crapware and eat up the few bytes of RAM left :)

The Remote Debugger ?

The idea is to continue using the development machine, where the source code is, and to connect via the network to a machine that will execute the program. After that, the remote debugging session is very similar to a local session, with the exception of "Edit and Continue", which is not supported. But most of the time, we can live without it.

 

Running the debugger from Visual Studio

It is possible to run the execution on the remote machine by using the "Use Remote Machine" option in the "Debug" tab of a C# project. It is important to note that checking this option implies that all paths specified in "Working Directory" or "External Program" are those of the remote machine.

Additionally, Visual Studio will not copy the binaries and PDB files to the remote machine. You have to copy the files to the appropriate location yourself, for instance by using a "Post Build Action" that targets a UNC path in the form of "\\mymachine\c$\temp".

 

Attach to a Running Process

It is also possible to attach to a running process, by using the "Debug / Attach To Process" option. You just need to fill in the "Qualifier" and set the name of the remote debugger, and to choose the process to debug.

Quick hint: The option "Show processes from all users" is not enabled by default. This means that if you want to debug a Windows Service, you will not see it in the list until you enable it.

Finally, the "Attach To Process" window is also very useful with local processes. It is a very handy feature to create a memory dump of a process that takes too much memory, and analyze it.

 

Installing the Remote Debugger

The Remote Debugger is an additional Visual Studio component that is located on the installation media, in the "Remote Debugger" folder. Three versions exist : x86, x64 and ia64 (RIP, Itanium...). If you have to debug a 32-bit process on a 64-bit machine, I advise that you install both the x86 and x64 versions. You will have to choose which remote debugger to run depending on the .NET runtime that is used. You can see which version to use in the "Type" column of the "Attach to Process" window.

Here's what to do :

  • If you are using VS2008 SP1, you can download it here, and for VS2010 you can use the install located on the DVD
  • Once installed on the remote machine, install the RDBG service with the wizard, using the LocalSystem account.
  • You may have a message about a security issue. If you do, follow these steps :
    • Open the "Local Security Policy" section of the "Administrative Tools" control panel
    • Go to the "Local Policies" / "Security Options"
    • Double click on "Network access: Sharing and security model for local accounts" and set the value to "Classic : Local users authenticate as themselves"
    • Close the window
  • If your machine is not on the same domain as your development machine, or even if it's not on a domain at all, add a local user account on the remote machine that has the same name as your current username, and make it a member of the administrators group. The password also has to be the same.
  • Start the remote debugger on the remote machine. Note that to debug a 32-bit process, you have to run the 32-bit version of the debugger.
  • On the development machine, open the "Attach to process" window, and type the identifier of the remote debugger (shown on the remote debugger window). It should look like this: administrator@my-machine.

Note that the firewall on either the development or the remote machine can prevent the remote debugger from working properly. You can temporarily disable it, but make sure to re-enable it afterwards. If you only want to open specific ports, port 135/TCP is used. The Remote Debugger uses DCOM as its communication protocol.

 

And if my breakpoints stay empty red circles ?

This is a very common situation that means that the pdb files do not match the loaded binaries. Make sure that you've copied the pdb files along with the dlls.

The "Debug / Windows / Modules" window shows whether the debug symbols have been loaded properly, and if it's not the case, the "View / Output / Debug" window will most of the time show why.


Happy debugging !

About me

My name is Jerome Laban, I am a Software Architect, C# MVP and .NET enthusiast from Montréal, QC. You will find my blog on this site, where I'm adding my thoughts on current events, or the things I'm working on, such as the Remote Control for Windows Phone.