Writing custom queries against an Azure table

While not as powerful as SQL, Azure tables do allow you to do minimal querying. With the native SDK, you’d do this using the table_query object’s set_filter_string function. Here’s a modified ReadTableData method from the previous blog entries.

void ReadTableData(string_t filter)
{
  auto storage_account = cloud_storage_account::parse(
      U("UseDevelopmentStorage=true"));
  auto table_client = storage_account.create_cloud_table_client();
  auto table = table_client.get_table_reference(U("Clients"));
  bool created = table.create_if_not_exists();

  table_query query;
  query.set_filter_string(filter);

  auto results = table.execute_query(query);

  for (auto item : results)
  {
    auto properties = item.properties();

    for (auto property : properties)
    {
      ucout << property.first << U(" = ") 
            << property.second.str() << U("\t");
    }

    ucout << endl;
  }
}

Here’s an example query that gets all rows within a range of RowKey values.

void ReadTableData(string_t rowKeyStart, string_t rowKeyEnd)
{
  auto filter = table_query::combine_filter_conditions(
    table_query::generate_filter_condition(
        U("RowKey"), 
        query_comparison_operator::greater_than_or_equal, 
        rowKeyStart),
    query_logical_operator::and,
    table_query::generate_filter_condition(
        U("RowKey"), 
        query_comparison_operator::less_than_or_equal, 
        rowKeyEnd));

  ReadTableData(filter);
}

The generate_filter_condition function builds a single comparison, and combine_filter_conditions joins two of them into one filter string. The query_comparison_operator class provides the comparison operators and the query_logical_operator class provides the logical operators. In case you are wondering, for a range of 100 to 104 the filter gets converted to the following string.

(RowKey ge '100') and (RowKey le '104')
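To make the shape of that string less mysterious, here’s a rough sketch that rebuilds it with plain string concatenation. The helpers make_condition, combine_conditions and make_row_key_range_filter are hypothetical names I made up for illustration; they are not the library’s functions and only mimic the output shown above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in (NOT a library function): builds one comparison
// such as  RowKey ge '100'
std::string make_condition(const std::string& column,
                           const std::string& op,
                           const std::string& value)
{
  return column + " " + op + " '" + value + "'";
}

// Hypothetical stand-in (NOT a library function): wraps both sides in
// parentheses and joins them with a logical operator.
std::string combine_conditions(const std::string& left,
                               const std::string& logical_op,
                               const std::string& right)
{
  return "(" + left + ") " + logical_op + " (" + right + ")";
}

// Builds the RowKey range filter for an inclusive start and end key.
std::string make_row_key_range_filter(const std::string& start,
                                      const std::string& end)
{
  return combine_conditions(
      make_condition("RowKey", "ge", start),
      "and",
      make_condition("RowKey", "le", end));
}

// Small self-check that the sketch reproduces the string from the text.
void check_row_key_range_filter()
{
  assert(make_row_key_range_filter("100", "104") ==
         "(RowKey ge '100') and (RowKey le '104')");
}
```

In real code you’d still let the library generate the filter, since it knows how to format non-string types.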

Here’s a similar method, that queries against the Name column.

void ReadTableDataStartingWith(string_t letter1, string_t letter2)
{
  auto filter = table_query::combine_filter_conditions(
    table_query::generate_filter_condition(
        U("Name"), 
        query_comparison_operator::greater_than_or_equal, 
        letter1),
    query_logical_operator::and,
    table_query::generate_filter_condition(
        U("Name"), 
        query_comparison_operator::less_than, 
        letter2));

  ReadTableData(filter);
}

The generated query filter looks like this.

(Name ge 'D') and (Name lt 'E')

You can call it as follows:

ReadTableDataStartingWith(U("D"), U("E"));

That’d return all rows with names starting with a D. Querying against the partition and row keys would be a faster approach, because those are the only indexed properties and a filter on any other column forces a scan. Also, for repeated querying against a set of data, you may want to fetch a larger subset of the data once and then query it in memory using the STL.
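As a rough idea of what the in-memory approach could look like, here’s a small sketch. ClientRow and NamesStartingWith are hypothetical names, and the sketch assumes the rows have already been fetched from the table and copied into a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical local copy of a row that was already fetched from the table.
struct ClientRow
{
  int id;
  std::string name;
  std::string phone;
};

// Returns the rows whose name starts with the given letter.
std::vector<ClientRow> NamesStartingWith(
    const std::vector<ClientRow>& rows, char letter)
{
  std::vector<ClientRow> matches;
  std::copy_if(rows.begin(), rows.end(), std::back_inserter(matches),
      [letter](const ClientRow& row)
      {
        return !row.name.empty() && row.name[0] == letter;
      });
  return matches;
}

// Small self-check using made-up rows.
void check_names_starting_with()
{
  std::vector<ClientRow> rows = {
    { 100, "John Brown", "777-1234" },
    { 101, "Mary Jane", "777-5678" }
  };
  assert(NamesStartingWith(rows, 'M').size() == 1);
}
```

Once the data is local, each extra query costs no round trip to the storage service, at the price of working on a snapshot that may be stale.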


Modifying Azure table data

The last blog entry showed how to read data from Azure storage tables. This blog entry will show how to insert, update, and delete data to and from Azure table storage.

Inserting data

void InsertTableData(string_t key, int id, 
    string_t name, string_t phone)
{
  auto storage_account = cloud_storage_account::parse(
      U("UseDevelopmentStorage=true"));
  auto table_client = storage_account.create_cloud_table_client();
  auto table = table_client.get_table_reference(
      U("Clients"));
  bool created = table.create_if_not_exists();

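  // partitionKey is not a parameter; it is assumed to be defined
  // elsewhere (for example, a string_t at file scope)
  table_entity entity(partitionKey, key);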
  auto& properties = entity.properties();
  properties.reserve(3);
  properties[U("Name")] = entity_property(name);
  properties[U("Phone")] = entity_property(phone);
  properties[U("Id")] = entity_property(id);

  auto operation = table_operation::insert_entity(entity);
  auto result = table.execute(operation);
}

Azure table data is really just about properties. So all that’s involved is creating a new table entity, setting up some properties (named keys with values), and then executing an insert_entity operation against the table.

Updating data

void UpdateTableData(string_t key, int id, 
    string_t name, string_t phone)
{
  auto storage_account = cloud_storage_account::parse(
      U("UseDevelopmentStorage=true"));
  auto table_client = storage_account.create_cloud_table_client();
  auto table = table_client.get_table_reference(
      U("Clients"));
  bool created = table.create_if_not_exists();

  auto operation = table_operation::retrieve_entity(
      partitionKey, key);
  auto result = table.execute(operation);
  auto entity = result.entity();
  auto& properties = entity.properties();
  properties[U("Name")] = entity_property(name);
  properties[U("Phone")] = entity_property(phone);
  properties[U("Id")] = entity_property(id);

  auto operationUpdate = table_operation::replace_entity(entity);
  result = table.execute(operationUpdate);
}

This code is very similar. The retrieve_entity operation is used to access the entity we need to update. The property values are updated, and then a replace_entity operation is executed.

Deleting data

void DeleteTableData(string_t key)
{
  auto storage_account = cloud_storage_account::parse(
      U("UseDevelopmentStorage=true"));
  auto table_client = storage_account.create_cloud_table_client();
  auto table = table_client.get_table_reference(
      U("Clients"));
  bool created = table.create_if_not_exists();

  auto operation = table_operation::retrieve_entity(
      partitionKey, key);
  auto result = table.execute(operation);
  auto entity = result.entity();

  auto operationDelete = table_operation::delete_entity(entity);
  result = table.execute(operationDelete);
}

This is very similar to the update, except that a delete_entity operation is executed. Future blog entries will talk about accessing Azure blob storage using the native SDK.

Azure Storage access using C++

The Microsoft Azure Storage Client Library for C++ is a library built on top of the C++ REST SDK that lets you access Azure Storage from your C++ apps. You need to install it via NuGet. While installing it, I could not locate it using the NuGet packages UI in Visual Studio 2013. Instead I had to install it via the Package Manager Console. I assume it’s because the library is pre-release. The latest version is 0.3.0 (preview) dated May 16 2014. So when you install it, you need to add the -pre parameter.

install-package wastorage -pre

This will add the REST SDK and other dependencies like the wastorage.redist to your project. To test if it works, I created a table using a quickly put together C# project and then accessed it from a C++ console app. I just used the Azure storage emulator. Here are the includes you need to have in your project.

#include "../packages/wastorage.0.3.0-preview/build/native/include/was/storage_account.h"
#include "../packages/wastorage.0.3.0-preview/build/native/include/was/table.h"

You can also add this using namespace declaration to save some keystrokes.

using namespace azure::storage;

Here’s the code snippet, and I’ve added comments to make it easier to understand.

// This returns the storage account object
auto storage_account = cloud_storage_account::parse(U("UseDevelopmentStorage=true"));

// This returns the table client service object
auto table_client = storage_account.create_cloud_table_client();

// Now, we get a reference to a named table
auto table = table_client.get_table_reference(U("Clients"));

// This will create the table if it does not exist
bool created = table.create_if_not_exists();

// We get everything without a filter
table_query query;
auto results = table.execute_query(query);

// Each item represents a table entity
for (auto item : results)
{
  auto properties = item.properties();

  // Each property from the table entity is returned as a key-value pair
  for (auto property : properties)
  {
    ucout << property.first << U(" = ") << property.second.str() << U("\t");
  }

  ucout << endl;
}

Example output:

Id = 100        Name = John Brown       Phone = 777-1234
Id = 101        Name = Mary Jane        Phone = 777-5678

Now this was basically as easy as using C#. The one difference, and it’s a non-trivial one, is that with C# you can map the returned table_entity to a .NET type. So I can write something like this:

TableQuery<Client> query = new TableQuery<Client>();

foreach (var item in table.ExecuteQuery(query))
{
    Console.WriteLine("{0} - {1}", item.Name, item.Phone);
}

Here, Client is an entity object.

class Client : TableEntity
{
    public Client(int id)
    {
        this.Id = id;
        this.PartitionKey = id.ToString();
        this.RowKey = id.ToString();
    }

    public Client()
    {
    }

    public string Name { get; set; }
    public string Phone { get; set; }
    public int Id { get; set; }
}

I am guessing here, but it’s most likely done via reflection. With C++, you’d need to write plumbing code to map the returned data to your C++ objects. Not a big deal really, and more a matter of coding convenience. Probably more performant to do it the C++ way (that’s an unverified personal thought).
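Here’s a minimal sketch of what that plumbing code might look like. To keep it self-contained, the entity’s properties are represented by a plain unordered_map of strings (PropertyBag), which is my stand-in and not the library’s type; with the real library you’d read the values from the entity_property objects instead. Client and ToClient are likewise hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Stand-in for the property bag of one table entity. Plain strings are
// an assumption made for this sketch, not the library's representation.
using PropertyBag = std::unordered_map<std::string, std::string>;

// Hypothetical C++ counterpart of the C# Client entity.
struct Client
{
  int id = 0;
  std::string name;
  std::string phone;
};

// Copies the known properties into a Client; properties that are missing
// from the bag leave the corresponding member at its default value.
Client ToClient(const PropertyBag& properties)
{
  Client client;

  auto it = properties.find("Id");
  if (it != properties.end())
  {
    client.id = std::stoi(it->second);
  }

  it = properties.find("Name");
  if (it != properties.end())
  {
    client.name = it->second;
  }

  it = properties.find("Phone");
  if (it != properties.end())
  {
    client.phone = it->second;
  }

  return client;
}

// Small self-check with a made-up property bag.
void check_to_client()
{
  PropertyBag bag = { { "Id", "100" }, { "Name", "John Brown" } };
  assert(ToClient(bag).id == 100);
}
```

The mapping is explicit and a bit tedious, but there’s no reflection involved and a missing property is handled in one obvious place.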

Using weak_ptr

The weak_ptr holds a weakly referenced pointer to an object that is managed by a shared_ptr (or by multiple shared_ptr instances). The weak_ptr does not affect the strong ref count. You typically construct a weak_ptr out of a shared_ptr, and then when you need to access the underlying object, you call lock() on the weak_ptr, which gives you a shared_ptr (with the ref count incremented). One use of weak_ptr types is to help avoid circular references, which often lead to memory leaks because the objects keep each other alive. Another example would be cached access to objects that may or may not be alive in memory. So you’d store the weak_ptrs and whenever you need to access the object, you’d check to see if the object’s alive, and create a shared_ptr from the weak_ptr as needed. The code snippet below shows such a pattern.

class NumberStoreCache
{
private:
  unordered_map<int, weak_ptr<NumberStore>> cache;

  shared_ptr<NumberStore> AddToCache(int number)
  {
    shared_ptr<NumberStore> store = make_shared<NumberStore>(
        number);
    weak_ptr<NumberStore> weak(store);
    cache[number] = weak;
    return store;
  }

public:
  shared_ptr<NumberStore> GetNumberStore(int number)
  {
    if (cache.find(number) == cache.end())
    {
      return AddToCache(number); // call 1 
    }
    else
    {
      weak_ptr<NumberStore> weak = cache[number];
      if (weak.expired())
      {
        return AddToCache(number); // call 2
      }
      else
      {
        return weak.lock(); // call 3
      }
    }
  }
};

void Foo()
{
  NumberStoreCache nsCache;
  auto ns1 = nsCache.GetNumberStore(10); // call 1
  ns1.reset();
  auto ns2 = nsCache.GetNumberStore(10); // call 2
  auto ns3 = nsCache.GetNumberStore(10); // call 3
}
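The cache covers the second use case mentioned above. For the first one (avoiding circular references), here’s a minimal sketch with two hypothetical types, Parent and Child. The parent owns the child through a shared_ptr, while the child only keeps a weak_ptr back to its parent, so the pair can still be destroyed.

```cpp
#include <cassert>
#include <memory>

struct Child;

struct Parent
{
  std::shared_ptr<Child> child;  // the parent owns the child
};

struct Child
{
  std::weak_ptr<Parent> parent;  // weak back-reference, no ownership
};

// Returns true if the parent was destroyed once the last outside
// shared_ptr to it went out of scope.
bool ParentIsReleased()
{
  std::weak_ptr<Parent> observer;
  {
    auto parent = std::make_shared<Parent>();
    parent->child = std::make_shared<Child>();
    parent->child->parent = parent;
    observer = parent;

    // The weak references do not add to the strong ref count
    assert(parent.use_count() == 1);
  }
  return observer.expired();
}
```

If Child held a shared_ptr to its Parent instead, each object would keep the other’s ref count above zero and neither destructor would ever run.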

Using shared_ptr

While unique_ptr is meant for single-owner scenarios, shared_ptr is the reference counted smart pointer class that allows you to share the smart pointer around your code. Consider the code snippet below, which uses the NumberStore example from the previous blog entry.

void SharedPtrVersion()
{
  shared_ptr<NumberStore> number(
      new NumberStore(100)); // Ref Count : 1
  Foo(number); // Ref Count : 1 on function return
  shared_ptr<NumberStore> copy = number; // Ref Count : 2
  Foo(number); // Ref Count : 2 on function return
  copy.reset(); // Ref Count : 1
  Foo(number); // Ref Count : 1 on function return
  Foo(copy); // This will crash
}

The output will be:

NumberStore ctor
Foo : 100 Ref Count : 2
Foo : 100 Ref Count : 3
Foo : 100 Ref Count : 2
NumberStore dtor

The ref count goes up inside Foo’s body because Foo receives a copy of the shared_ptr. It drops back as soon as Foo returns. Creating a copy increments the ref count as expected. Calling reset() decrements the ref count and releases ownership. Passing a reset shared_ptr to Foo, which dereferences it, will give you a crash / access violation.
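The shared_ptr overload of Foo isn’t listed here, so the sketch below is my assumption of its shape, inferred from the output: it takes the shared_ptr by value and prints use_count(). The NumberStore in the sketch is a pared-down stand-in, returning the count is something I added so it can be checked, and SafeFoo is a hypothetical guarded variant that avoids the crash on a reset pointer.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <utility>

// Pared-down stand-in for the NumberStore class from the earlier entry.
class NumberStore
{
  int _num;

public:
  explicit NumberStore(int num = 0) : _num(num) { }
  int Num() const { return _num; }
};

// Assumed shape of Foo: the parameter is taken by value, so the caller's
// shared_ptr is copied and the ref count is one higher inside the body.
// Returning the count is an addition made for this sketch.
long Foo(std::shared_ptr<NumberStore> number)
{
  std::cout << "Foo : " << number->Num()
            << " Ref Count : " << number.use_count() << std::endl;
  return number.use_count();
}

// Hypothetical guarded variant: an empty (reset) shared_ptr converts to
// false, so it is never dereferenced.
bool SafeFoo(std::shared_ptr<NumberStore> number)
{
  if (!number)
  {
    return false;
  }
  Foo(std::move(number));
  return true;
}

// Small self-check of the count seen inside Foo.
void check_foo_ref_count()
{
  auto number = std::make_shared<NumberStore>(100);
  assert(Foo(number) == 2);
}
```

Taking the parameter by const reference instead would leave the count unchanged inside the function, which is worth knowing if you log use_count() while debugging.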

The shared_ptr class also allows you to pass a lambda as the deletion method called in the destructor, so you can do something like this.

void SharedPtrArrayVersion()
{
  shared_ptr<NumberStore> number(
    new NumberStore[3], 
    [](NumberStore* pNumStore) { delete[] pNumStore; });
}
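If you’d rather not write the lambda, the standard library’s default_delete specialization for arrays can serve as the deleter too. The sketch below uses a hypothetical Counted type that tracks how many instances are alive, just to confirm that all three elements get destroyed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts live instances, used only to observe
// that every array element is destroyed.
struct Counted
{
  static int alive;
  Counted() { ++alive; }
  ~Counted() { --alive; }
};

int Counted::alive = 0;

// Creates an array of three objects owned by a shared_ptr and returns
// how many are still alive after the shared_ptr has gone out of scope.
int AliveAfterArrayRelease()
{
  {
    std::shared_ptr<Counted> items(
        new Counted[3],
        std::default_delete<Counted[]>());  // calls delete[] on release
    assert(Counted::alive == 3);
  }
  return Counted::alive;
}
```

Both forms do the same thing; the lambda is more flexible when the cleanup is something other than delete[].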

Note: when you call reset, if the ref count drops to 0, the object is destroyed. Example:

void SharedPtrVersion2()
{
  shared_ptr<NumberStore> number(new NumberStore(100));
  shared_ptr<NumberStore> copy = number;
  number.reset();
  copy.reset(); // destructor is called here
  cout << "end of method" << endl;
}

The next blog entry will talk about the weak_ptr class (the last of the 3 smart pointers introduced in C++11).

Using unique_ptr instead of auto_ptr

I’ve had a bit of a blogging hiatus and hope to make amends for that. Going forward, I will continue to blog on modern C++ features and also focus on newer frameworks from Microsoft (primarily C++ focused, but non-C++ technologies that interest me will also be discussed).

For years, we’ve all used auto_ptr with all its pitfalls. A minor issue was its lack of support for an array of objects: it always calls delete rather than delete[], so using it with an array is undefined behavior, and in practice often only the first object would get destroyed. A much bigger issue is that the auto_ptr transfers ownership when it’s assigned to another auto_ptr. And because this happens in a non-obvious fashion, it’s fairly easy to introduce problems in your code. The unique_ptr solves both these problems. It has a partial specialization that correctly calls delete[] on an array of objects. It also emphasizes the point of unique ownership. So you have to explicitly move a unique_ptr into another unique_ptr.

Here are some code snippets that show this more clearly.

class NumberStore
{
  int _num;

public:
  NumberStore(int num = 0) : _num(num)
  { 
    cout << "NumberStore ctor" << endl;
  }

  ~NumberStore()
  {
    cout << "NumberStore dtor" << endl;
  }

  int Num()
  {
    return _num;
  }
};

void Foo(auto_ptr<NumberStore> number)
{
  cout << "Foo : " << number->Num() << endl;
}

void AutoPtrProblem()
{
  auto_ptr<NumberStore> number(new NumberStore(100));
  Foo(number);
  Foo(number); //This call will crash
}

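The 2nd call to Foo(...) will crash because the first call silently transferred ownership to Foo’s parameter, which destroyed the object when Foo returned and left number holding a null pointer. Now consider the unique_ptr version.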

void Foo(unique_ptr<NumberStore> number)
{
  cout << "Foo : " << number->Num() << endl;
}

void UniquePtrVersion()
{
  unique_ptr<NumberStore> number(new NumberStore(100));
  // Foo(number); <-- This won't compile
  Foo(move(number));
  Foo(move(number)); //This is an obvious programmer error now
}

By forcing you to explicitly call move, the compiler makes it obvious that a transfer of ownership is happening there. It’ll take some conscious bad programming to create a similar crash as before. As for an array of objects, the following method will call the destructor thrice.

void UniquePtrArrayVersion()
{
  unique_ptr<NumberStore[]> number(new NumberStore[3]);
}
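Going back to the explicit move for a moment: once a unique_ptr has been moved from, it is guaranteed to be empty, so it can be tested before use. Here’s a minimal sketch; Consume and SourceIsEmptyAfterMove are hypothetical names, and a plain int is used instead of NumberStore to keep it short.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Takes ownership of the int and returns its value. The int is destroyed
// when the parameter goes out of scope at the end of the function.
int Consume(std::unique_ptr<int> value)
{
  return *value;
}

// Returns true if the source pointer is empty after being moved from.
bool SourceIsEmptyAfterMove()
{
  std::unique_ptr<int> number(new int(100));
  int seen = Consume(std::move(number));
  assert(seen == 100);

  // number no longer owns anything, so it compares equal to nullptr
  return number == nullptr;
}
```

A check like if (number) before the second call is all it takes to turn the crash into a handled case.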

So, going forward, if you think you have a need for a single-owner smart pointer class, unique_ptr is the one to use, not auto_ptr. For other situations, you still would not use auto_ptr. Instead you’d use shared_ptr, which I’ll be blogging about shortly.

VC++ 2013 – Initializer lists and uniform initialization

We’ve always been able to use initializer lists with arrays. Now you can do it with any type that has a method that takes an argument of type std::initializer_list<T> (including constructors). The standard library collections have all been updated to support initializer lists.

void foo()
{
  vector<int> vecint = { 3, 5, 19, 2 };
  map<int, double> mapintdoub =
  {
    { 4, 2.3 },
    { 12, 4.1 },
    { 6, 0.7 }
  };
}

And it’s trivial to do this with your own functions.

void bar1(const initializer_list<int>& nums) 
{
  for (auto i : nums)
  {
    // use i
  }
}

bar1({ 1, 4, 6 });

You can also do it with your user defined types.

class bar2
{
public:
  bar2(initializer_list<int> nums) { }
};

class bar3
{
public:
  bar3(initializer_list<bar2> items) { }
};

bar2 b2 = { 3, 7, 88 };

bar3 b3 = { {1, 2}, { 14 }, { 11, 8 } };

Uniform initialization is a related feature that’s been added to C++11. With brace syntax, the compiler automatically uses the constructor that matches the listed arguments.

class bar4
{
  int x;
  double y;
  string z;

public:
  bar4(int, double, string) { }
};

class bar5
{
public:
  bar5(int, bar4) { }
};

bar4 b4 { 12, 14.3, "apples" };

bar5 b5 { 10, { 1, 2.1, "bananas" } };

If there’s an initializer-list constructor, it takes precedence over another matching constructor.

class bar6
{
public:
  bar6(int, int) // (1)
  {
    // ...
  }

  bar6(initializer_list<int>) // (2)
  {
    // ...
  }
};
  
bar6 b6 { 10, 10 }; // --> calls (2) above
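To see the precedence rule in action, here’s a small sketch. traced_bar6 is a hypothetical variant of bar6 with a member I added to record which constructor ran. The same rule explains a well-known surprise with vector: parentheses pick the (count, value) constructor, while braces pick the initializer-list one.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical variant of bar6 that records which constructor ran.
class traced_bar6
{
public:
  int which;

  traced_bar6(int, int) : which(1) { }                    // (1)
  traced_bar6(std::initializer_list<int>) : which(2) { }  // (2)
};

// Parentheses: ten elements, each with the value 10.
std::size_t SizeWithParens()
{
  std::vector<int> v(10, 10);
  return v.size();
}

// Braces: two elements, 10 and 10.
std::size_t SizeWithBraces()
{
  std::vector<int> v{ 10, 10 };
  return v.size();
}

// Small self-check of which constructor each syntax selects.
void check_constructor_choice()
{
  traced_bar6 braces{ 10, 10 };
  traced_bar6 parens(10, 10);
  assert(braces.which == 2);
  assert(parens.which == 1);
}
```

So if you do want constructor (1) on a type that also has an initializer-list constructor, use parentheses.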