RESTful Web Services is one of the older books on how engineers should design and build a REST API. It was released in 2007, which in Internet years is eons ago. It was one of the books I read while getting up to speed with different facets of RESTful architecture while helping to build a Java JSR-311 JAX-RS implementation. Sometimes I find it interesting to re-read older books to see if the authors were able to correctly predict “the future”, if the ideas “stood the test of time”, and if anyone followed their advice.

Resource-Oriented Architecture (ROA)

The main discussion of the book centers around Leonard Richardson and Sam Ruby’s Resource-Oriented Architecture (ROA), which has 4 concepts:

1. Resources
2. Their names (URIs)
3. Their representations
4. The links between them

I don’t know if it was revolutionary thinking at the time, but it certainly seemed to distill what Dr. Roy Fielding wrote in his dissertation into a simple four-item list. The book does not claim that ROA is the only way to build a RESTful service, but it offers a pragmatic way to break down how to build one.

In my experience, most self-proclaimed RESTful APIs usually get the first 3 concepts correct to some degree, but generally outright fail with the fourth concept, “The links between them”. In 2008, Fielding called out that REST APIs must be hypertext-driven, yet it still does not seem that many APIs follow that design principle. Part of the problem may be that JSON, which has become the default API media type, has no built-in notion of a hyperlink, even though there are some JSON API standards which try to improve the situation. Another solution, Link headers, seems to come and go in style.
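As a rough illustration of what “the links between them” can look like in JSON, here is a minimal Python sketch. The `links` key, the `rel` names, and the URIs are assumptions made up for this example; they are not a format the book or any particular JSON standard prescribes.

```python
# Hypothetical order representation. The "links" key and the rel names are
# illustrative assumptions, not a standard.
order = {
    "id": 42,
    "status": "open",
    "links": [
        {"rel": "self", "href": "/orders/42"},
        {"rel": "customer", "href": "/customers/7"},
        {"rel": "payment", "href": "/orders/42/payment"},
    ],
}


def find_link(representation, rel):
    """Return the href for the given link relation, or None if it is absent."""
    for link in representation.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


# The client asks for a relation by name instead of building the URI itself.
customer_href = find_link(order, "customer")
```

The point of the design is that the client only knows relation names; the server stays free to change the URIs behind them.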

There were 4 properties of ROA which were also declared at the end of Chapter 4:

1. Addressability
2. Statelessness
3. Connectedness
4. A uniform interface

Most RESTful web services get at least Addressability and “A uniform interface” right. It is rare to see overloaded RPC-style POST method operations in a “RESTful” API anymore. Most RESTful APIs understand the pragmatic difference between PUT vs. POST. Statelessness is usually achieved with some caveats. Connectedness is perhaps the most difficult of the listed properties to implement.

All things considered, it seems that the authors correctly outlined how a RESTful architecture should be designed and implemented, and most services are trying to achieve all the properties of a RESTful service.

Somewhat Uncommon, Yet Still Interesting Ideas

URIs containing content language and content type

The recommendation to prefer the URI to contain the content type (e.g. end in “.json”) and content language (e.g. end in “.en.html” to represent an English HTML document) versus the Accept and Accept-Language header was interesting. While perhaps not uncommon in some frameworks, other frameworks have totally gone all-in for HTTP headers to determine the resource representation and media type.

After having to explain over the years about how to get JSON vs. XML with an Accept header and how to manually do content negotiation on the server side, the resource content type in the URI is great pragmatic advice which I wish all frameworks would at least support.

It is easy to copy and give a URI to another person to get the same representation in the same media type. Asking someone to launch their non-browser client and specify an HTTP header is one of the most inconvenient ways to try an API.
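A server could support this convention by peeling the suffixes off the last path segment before routing. The sketch below is one possible approach in Python; the suffix tables, the defaults, and the function name are all assumptions for illustration.

```python
# Suffix tables are illustrative assumptions; a real service would define its own.
MEDIA_TYPES = {"json": "application/json", "xml": "application/xml", "html": "text/html"}
LANGUAGES = {"en", "fr", "de"}


def representation_from_path(path, default_type="application/json", default_lang="en"):
    """Split trailing '.lang' and '.type' suffixes off a path like '/docs/intro.en.html'."""
    head, _, last_segment = path.rpartition("/")
    parts = last_segment.split(".")
    name, suffixes = parts[0], parts[1:]
    media_type, language = default_type, default_lang
    # Consume suffixes from the right: media type first, then language.
    if suffixes and suffixes[-1] in MEDIA_TYPES:
        media_type = MEDIA_TYPES[suffixes.pop()]
    if suffixes and suffixes[-1] in LANGUAGES:
        language = suffixes.pop()
    # Anything unrecognized stays part of the resource name.
    resource = head + "/" + ".".join([name] + suffixes)
    return resource, media_type, language
```

In this sketch the defaults stand in for whatever the Accept and Accept-Language headers would otherwise select, so a URI without suffixes can still fall back to header-based negotiation.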

POST(a) for appending to a standalone resource

The authors make a distinction between POST append (POST(a)) vs. POST overloaded to distinguish between using POST as part of a uniform interface vs. using POST as a way to introduce RPC methods.

What was more interesting is that many APIs today use collection factory resources that are POSTed to and which then create child resources. However, another way to use HTTP POST is to append to a standalone resource itself vs creating a child resource. While I believe it is uncommon, it is still a valid RESTful usage of POST.

Comma for Ordered, Semi-colon for Unordered

While I have not seen too many commas or semi-colons in URIs lately, the proposed convention is nice to have for un-keyed parameters. Practically though, it seems everyone has adopted named query parameters. Matrix parameters seem like a forgotten URI convention.

Create More Resources

Many RESTful APIs take a simple approach and expose database tables as resources. However, many times the uniform interface is not adequate to describe all the operations on a resource. Instead of overloading POST and becoming an RPC-style service, the general recommended solution is to create more resources.

For instance, instead of creating a transaction RPC operation, you can create a transaction resource to represent all of the changes and then modify its state to be committed to execute the transaction.

While this approach generally works for simple transactions, queued jobs, etc., it may not be suitable in a more complex system.
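To make the transaction-as-resource idea concrete, here is a small in-memory Python sketch. The class, the method names, the `/transactions` URIs in the comments, and the state names are hypothetical; the book's own examples may differ.

```python
import itertools


class TransactionStore:
    """In-memory stand-in for a '/transactions' collection (hypothetical names)."""

    def __init__(self, accounts):
        self.accounts = accounts          # e.g. {"checking": 100, "savings": 0}
        self.transactions = {}
        self._ids = itertools.count(1)

    def post(self, source, target, amount):
        """POST /transactions: create a pending transaction resource."""
        tx_id = next(self._ids)
        self.transactions[tx_id] = {
            "source": source, "target": target, "amount": amount, "state": "pending",
        }
        return tx_id

    def put_state(self, tx_id, state):
        """PUT /transactions/{id}: setting state to 'committed' applies the change."""
        tx = self.transactions[tx_id]
        if state == "committed" and tx["state"] == "pending":
            if self.accounts[tx["source"]] < tx["amount"]:
                tx["state"] = "failed"
                return tx
            self.accounts[tx["source"]] -= tx["amount"]
            self.accounts[tx["target"]] += tx["amount"]
            tx["state"] = "committed"
        return tx
```

Because the commit only fires on the pending-to-committed transition, repeating the same PUT leaves the balances unchanged, which keeps the operation idempotent.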

The Book Wasn’t Always Right

While the overall ROA design discussion still holds up today, there are a couple of ideas that have fallen out of favor as time has gone by. Atom and XML in general are now secondary media types to JSON for APIs. WADL has not gained any traction over the years either.

Things Only Briefly Mentioned

Service versioning schemes, complex transactions, and batch operations are each briefly mentioned, yet are usually hot topics for any mature RESTful API.

Each is a difficult area to design, which I hope is more thoroughly discussed in other books…

There’s a Sequel

What prompted me to re-read this book was that I found out there was a sequel called RESTful Web APIs published in late 2013. I wanted to have a quick read to compare how the authors’ thoughts changed and how they were revised as a significant amount of time passed. Hopefully I’ll be able to write another blog post in the future about the follow-up book.

Disclaimer: These are some practical notes about the code generation in Magento 2 so there are edge cases and details that I will gloss over. Note that I did not architect, design, or write any of the relevant code but I do work on the Magento 2 codebase. This is not an official Magento 2 doc. You should find more detailed information at the official Developer Docs.

Several referenced classes in the Magento 2 codebase do not exist in the GitHub repository. For instance, look at the \Magento\Customer\Model\Resource\AddressRepository constructor.

<?php
...
    public function __construct(
        \Magento\Customer\Model\AddressFactory $addressFactory,
...

The first constructor parameter has a type of \Magento\Customer\Model\AddressFactory. However, this class does not exist in the \Magento\Customer\Model directory. You cannot find the class in the git repository.

How does Magento 2 work with these missing classes?

Code Generation

Code is generated when the application is run. Simply, if Magento 2 cannot find a class and the name of the class falls within a recognized convention (e.g. it ends with Factory), Magento 2 will generate the class.
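The “generate when missing and the name matches a convention” idea can be sketched outside of PHP. The following Python analogy is not Magento's implementation; `KNOWN_CLASSES`, `GENERATED`, and `load_class` are names invented for this example, with `GENERATED` standing in for the generated-code directory.

```python
class Address:
    """Stand-in for a hand-written model class."""

    def __init__(self, city=""):
        self.city = city


KNOWN_CLASSES = {"Address": Address}   # classes that exist in the repository
GENERATED = {}                         # stands in for the generated-code directory


def load_class(name):
    """Return a class by name, generating a Factory if the name fits the convention."""
    if name in KNOWN_CLASSES:          # hand-written code always wins
        return KNOWN_CLASSES[name]
    if name in GENERATED:              # already generated earlier
        return GENERATED[name]
    suffix = "Factory"
    if name.endswith(suffix) and name[:-len(suffix)] in KNOWN_CLASSES:
        target = KNOWN_CLASSES[name[:-len(suffix)]]
        # Build the class at runtime; create() forwards its arguments to the target.
        factory = type(name, (), {"create": lambda self, **data: target(**data)})
        GENERATED[name] = factory
        return factory
    raise LookupError(f"class {name} not found and no generator matches")


address = load_class("AddressFactory")().create(city="Austin")
```

Here nothing called `AddressFactory` is written by hand, yet asking for it yields a working class, which is the same effect the constructor parameter above relies on.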

Where is the generated code?

The generated code will be put in your MAGENTO_2_HOME/var/generation directory. Issue a web request to an installed Magento 2 instance, and you will see new files in that directory.

Unlike some other languages/libraries, you can look at the generated code on disk to see what really happens and still debug through the code.

When is code automatically generated?

For most people in a non-production mode (e.g. by default after a git clone), code is generated when Magento 2 cannot find the class during code execution.

However, you can also generate all of the code up front (instead of on demand) by using the \Magento\Tools\Di\compiler.php script, which is useful for production.

Why generate code?

There’s boring code, and then there’s interesting code. Code generation writes the boilerplate code to allow developers to write the more exciting code.

Code generation is useful when the logic follows a pattern but you need to have specific logic for a particular class.

Factory

For instance, a Factory class creates instances of a type. So a generated \Magento\Customer\Model\AddressFactory creates new instances of \Magento\Customer\Model\Address. The actual code in AddressFactory has some specific code for the Address type.

Proxy

In more complex code generation, a Proxy can be generated for a type. Generally, a proxy must have an implementation of all the declared public methods of the original class.

The method implementation could be delegating to another object in memory. Or the method could make a network call to another object on a different machine. All the Proxy methods in a class usually do the same thing (e.g. they all delegate to another object or they all make a network call) except they need a slight difference to call a specific method.
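A delegating proxy can be sketched in a few lines of Python. This is an analogy only: it relies on Python's dynamic attribute lookup rather than generated real methods, and the `StoreManager` class here is a hypothetical stand-in, not Magento's class.

```python
class StoreManager:
    """Stand-in for an expensive-to-build service (hypothetical, not Magento's class)."""

    instances = 0

    def __init__(self):
        StoreManager.instances += 1

    def get_store(self, code):
        return f"store:{code}"


class LazyProxy:
    """Builds the real object on first use, then forwards attribute access to it."""

    def __init__(self, factory):
        self._factory = factory
        self._subject = None

    def __getattr__(self, name):
        # Only called for attributes that are not found on the proxy itself.
        if self._subject is None:
            self._subject = self._factory()
        return getattr(self._subject, name)


proxy = LazyProxy(StoreManager)          # nothing is constructed yet
store = proxy.get_store("default")       # first call builds the real StoreManager
```

Every forwarded call goes through the same code path, which is exactly the kind of repetitive logic that is tedious to write by hand for each public method.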

In a practical example, you can see the StoreManager class and then see the generated StoreManager Proxy class.

Advantages of Generating Code

In both Factories and Proxies, the code can be simple to write. But do you really want a developer to spend time writing tedious code?

By generating the code, you can be assured:

  • The code is correct. You won’t have to worry that the generated code is delegating to the wrong method or forgetting a semicolon. You don’t have to write tests for the generated code. (Yes, this assumes the code generation is correct ☺ ; Magento 2 code generation is for well understood patterns and the code generator itself is tested).
  • Consistency in implementation. All generated Factories work the same way. So once you know how one Factory works, you should know how they all work.
  • Ability to change the implementation for all generated code. If you discover a better way of implementing a Proxy, you can do it across the board. If you want the code generator to use a PHP __call magic method or if you want real methods, you just need to change the generator. The maintenance of code is reduced.

But I need to write my own Factory!

Write the code in your module. If the class really exists, then code will not be generated even if the class name matches a convention.

However, if you do not have custom logic, it is a best practice to use the code generation whenever possible.

When is code regenerated?

If the class does not exist and is missing from var/generation, then the code generation will be called. So delete the var/generation directory, and code should be generated again.

When should I regenerate the code?

The practical developer advice is to regenerate whenever you update your Magento 2 code: run rm -rf var/generation and delete the other config cache directories.

Why should I regenerate the code?

Doesn’t this stuff work the first time? ☺

Suppose a Customer\Proxy class for a Customer class is generated. The Customer class has new methods added to it. Because a Customer\Proxy exists in var/generation, it will not be re-generated. However, the Customer\Proxy implementation is incomplete now because it does not have the new methods. You need to regenerate the Customer\Proxy class.
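The stale-proxy scenario can be reproduced with a toy generator in Python. This is a sketch under simplifying assumptions: the “generated code” is just a text file listing method names, and `generate_proxy_source` is a made-up helper, not part of Magento.

```python
import inspect
import tempfile
from pathlib import Path


class Customer:
    def get_name(self):
        return "Ada"


def generate_proxy_source(cls, out_dir):
    """Write a file listing cls's public methods, unless the file already exists."""
    path = Path(out_dir) / f"{cls.__name__}Proxy.txt"
    if path.exists():
        return path                      # existing output is trusted, never refreshed
    methods = [name for name, _ in inspect.getmembers(cls, inspect.isfunction)
               if not name.startswith("_")]
    path.write_text("\n".join(methods))
    return path


with tempfile.TemporaryDirectory() as generation_dir:
    first = generate_proxy_source(Customer, generation_dir).read_text()

    # The original class gains a method, but the generated file is already there.
    Customer.get_email = lambda self: "ada@example.com"
    stale = generate_proxy_source(Customer, generation_dir).read_text()

    # Deleting the generated file forces a fresh, complete version.
    (Path(generation_dir) / "CustomerProxy.txt").unlink()
    fresh = generate_proxy_source(Customer, generation_dir).read_text()
```

The second call returns the old file untouched, so the new method is missing until the generated output is deleted.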

On rare occasions, the code generator implementation itself is changed, which means you should regenerate all the classes.

Personally, I just delete the var/generation directory (among other cache directories) whenever I update the Magento 2 code. I don’t want to waste my time on weird issues just because my generated code is outdated.

How is the code generated?

A good starting point is to look at \Magento\Framework\Code\Generator and \Magento\Framework\ObjectManager\Code\Generator for the code generation implementation.

More specifically, you may want to check out the \Magento\Framework\Code\Generator\Autoloader.

Why are there Proxy and Factory classes in the lib?

Code generation is only intended for application module code and not the framework, so you will see code for factories and proxies in the Magento\Framework.

What are some things that can be generated?

Factories

Factories are useful for creating objects. In general, use a factory whenever you need to create non-singleton objects in your code.

Proxies

Proxies lazily instantiate objects and break cyclical dependencies. Normally you should never need to reference a proxy directly in code. You usually explicitly configure a Proxy via a di.xml (the dependency injection config file) as one of the constructor arguments to a class.

Interceptors

If you write a Plugin, an Interceptor is generated for the real implementation class to make the Plugin work. I rarely look at these generated classes and look at the Plugin code more. But the logic is quite interesting and helps unravel how Plugins work.

Questions?

Ask on the Magento Stack Exchange. You could also raise an issue on the Magento 2 GitHub tracker.

System.out.print("Hello World!");
NSLog(@"Hello World!");
alert("Hello World!")
<?php
echo "Hello World!";
puts "Hello World!"
print("Hello World!")