I am excited to announce the launch of my new video series, “Sr. Developer in 24 Hours”. This series is designed for developers who are looking to level up their coding skills and transition to a senior developer role. The material will span a total of 24 hours.
Throughout my career, mentoring has been a key aspect of my role, and one question that frequently comes up from junior developers is, “How can I make the leap to becoming a senior developer?” This series is my effort to compile and organize this information.
The series will consist of weekly 30-minute videos, covering various skills, techniques, and technologies that are essential for senior developers to excel in their teams. Topics will include setting up new projects, optimizing team productivity, incorporating non-functional requirements into applications, and much more. The content will also be compiled into a living ebook for easy reference.
Please note that this video series is targeted towards developers with some prior experience in the field.
Thank you, and I look forward to sharing this valuable content with you all!
The first video is available at https://www.youtube.com/watch?v=apzQpX139vQ.
All the videos will be listed here.
In dry terms, ThreadLocal provides per-thread variables. What does that mean? If you look at the API, you will see it only has four methods:

- get(): returns the value associated with the current thread.
- set(T value): sets the value associated with the current thread.
- remove(): removes the value associated with the current thread.
- initialValue(): returns a value when we invoke get() without having set a value previously.

Notice a pattern? Basically ThreadLocal allows us to store a value that will only be available to the current thread. Other threads will have access to their own values.
You can also think of ThreadLocal as a Map, where the key is the current Thread.
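To make the analogy concrete, here is a minimal runnable sketch (the class and field names are invented for illustration): a value set in one thread is invisible to another thread, which falls back to the initial value.

```java
public class ThreadLocalDemo {
    private static final ThreadLocal<String> holder =
            ThreadLocal.withInitial(() -> "default");

    // Reads the ThreadLocal from a freshly started thread.
    public static String valueInNewThread() {
        final String[] result = new String[1];
        Thread t = new Thread(() -> result[0] = holder.get());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        holder.set("main-value");               // visible only to the main thread
        System.out.println(holder.get());       // prints "main-value"
        System.out.println(valueInNewThread()); // prints "default": other threads see their own copy
    }
}
```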
ThreadLocal allows us to keep context on a per-thread basis. In other words, we can use ThreadLocal to access information without explicitly passing a “context” object through the call stack.
This aligns very well with the model used by Servlet application servers, in which each request is handled by a single thread. In such cases we can use ThreadLocal as a thread-global variable.
The truth is that even if you have never used ThreadLocal directly, there’s a good chance that a framework you use depends on ThreadLocal. To name a few examples:

- Spring’s transaction management (the machinery behind @Transactional work)
- Hibernate’s thread-bound sessions (ThreadLocalSessionContext)

Do any of those ring a bell? If so, then your application already depends heavily on ThreadLocal, and there’s enormous value in understanding its mechanism.
The simplest case in which we use ThreadLocal is to provide access to non-thread-safe objects in a thread-safe way, without having to use synchronization. One of the main culprits is SimpleDateFormat, which is relatively expensive to create and is not thread-safe. With ThreadLocal we can access it safely:
import java.text.SimpleDateFormat;
import java.util.Date;

public class TestController {

    private static final ThreadLocal<SimpleDateFormat> dateFormatHolder =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("MM/dd/yyyy hh:mm:ss a"));

    public String verify() {
        SimpleDateFormat sdf = dateFormatHolder.get();
        System.out.println("Request received: " + sdf.format(new Date()));
        return "Ok";
    }
}
In this example dateFormatHolder will hold a different instance of SimpleDateFormat for each thread. Calling dateFormatHolder.get() will return the instance of SimpleDateFormat already associated with the current thread; if one doesn’t exist, it will be created with the associated Supplier lambda.
In this example, no class outside of our TestController will have access to dateFormatHolder. You could also move dateFormatHolder to a util class, to allow multiple classes to have access to a thread-specific SimpleDateFormat.
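As a quick sanity check, a small runnable sketch (the class name here is invented) can confirm that each thread receives its own SimpleDateFormat instance from the same holder:

```java
import java.text.SimpleDateFormat;

public class FormatterIsolationDemo {
    private static final ThreadLocal<SimpleDateFormat> dateFormatHolder =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("MM/dd/yyyy"));

    // Returns true if two different threads receive distinct instances.
    public static boolean instancesAreDistinct() {
        SimpleDateFormat[] seen = new SimpleDateFormat[2];
        Thread t1 = new Thread(() -> seen[0] = dateFormatHolder.get());
        Thread t2 = new Thread(() -> seen[1] = dateFormatHolder.get());
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0] != seen[1];
    }

    public static void main(String[] args) {
        System.out.println(instancesAreDistinct()); // prints "true"
    }
}
```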
Spring Security provides a series of Filters that wrap every request that the application server (such as Tomcat) receives.
Take a look at the (somewhat abridged) SecurityContextPersistenceFilter from Spring Security.
public class SecurityContextPersistenceFilter extends GenericFilterBean {
public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
throws IOException, ServletException {
HttpServletRequest request = (HttpServletRequest) req;
HttpServletResponse response = (HttpServletResponse) res;
HttpRequestResponseHolder holder = new HttpRequestResponseHolder(request,
response);
SecurityContext contextBeforeChainExecution = repo.loadContext(holder); // 1
try {
SecurityContextHolder.setContext(contextBeforeChainExecution); // 2
chain.doFilter(holder.getRequest(), holder.getResponse()); // 3
}
finally {
SecurityContext contextAfterChainExecution = SecurityContextHolder
.getContext();
// Crucial removal of SecurityContextHolder contents - do this before anything
// else.
SecurityContextHolder.clearContext(); // 4
}
}
public void setForceEagerSessionCreation(boolean forceEagerSessionCreation) {
this.forceEagerSessionCreation = forceEagerSessionCreation;
}
}
The filter follows a simple order of events:

1. Load the SecurityContext (what roles do I have access to?) from the repository.
2. Store the SecurityContext in the SecurityContextHolder.
3. Execute the rest of the filter chain.
4. Clear the SecurityContextHolder once the request has been handled.
We’re going to skip SecurityContextHolder, since it basically delegates to a SecurityContextHolderStrategy. Instead we’re going to take a look at the default SecurityContextHolderStrategy, which is appropriately named ThreadLocalSecurityContextHolderStrategy.
final class ThreadLocalSecurityContextHolderStrategy implements
SecurityContextHolderStrategy {
private static final ThreadLocal<SecurityContext> contextHolder = new ThreadLocal<SecurityContext>();
public void clearContext() {
contextHolder.remove();
}
public SecurityContext getContext() {
SecurityContext ctx = contextHolder.get();
if (ctx == null) {
ctx = createEmptyContext();
contextHolder.set(ctx);
}
return ctx;
}
public void setContext(SecurityContext context) {
Assert.notNull(context, "Only non-null SecurityContext instances are permitted");
contextHolder.set(context);
}
public SecurityContext createEmptyContext() {
return new SecurityContextImpl();
}
}
This class is very simple, and for the most part is nothing but an adaptor to map ThreadLocal to a SecurityContextHolderStrategy.
So how is this used? Imagine we have a service we want to secure:
@Component
public class AdminService {
@Secured("ADMIN")
public void admin(Model model) {
System.out.println("Got new model");
}
}
Annotations like @Secured are normally implemented using AOP and proxies, which are examined here and here. The important concept is that there’s an interceptor which is executed before the actual admin(Model model) method, and it’s the interceptor’s responsibility to determine whether the user has the “ADMIN” role. If the user has the role, execution continues; otherwise an exception is thrown.
Let’s look at AbstractSecurityInterceptor, the part of the Aspect Oriented Programming (AOP) machinery that allows us to use the @Secured annotation:
public abstract class AbstractSecurityInterceptor implements InitializingBean,
ApplicationEventPublisherAware, MessageSourceAware {
private AccessDecisionManager accessDecisionManager;
protected InterceptorStatusToken beforeInvocation(Object object) {
Authentication authenticated = authenticateIfRequired(); //1
// Attempt authorization
try {
this.accessDecisionManager.decide(authenticated, object, attributes); //2
}
catch (AccessDeniedException accessDeniedException) {
publishEvent(new AuthorizationFailureEvent(object, attributes, authenticated, //3
accessDeniedException));
throw accessDeniedException;
}
//Removed InterceptorStatusToken generation code for brevity
}
private Authentication authenticateIfRequired() {
Authentication authentication = SecurityContextHolder.getContext()
.getAuthentication();
//Removed code to try to re-authenticate for brevity
return authentication;
}
}
The interceptor operates in three steps:

1. Retrieve the Authentication object from SecurityContextHolder, which delegates to one of the strategies (for example ThreadLocalSecurityContextHolderStrategy), which in turn retrieves the Authentication data from ThreadLocal as we saw above.
2. The Authentication is checked against the attributes of the method we’re calling (basically, who can run this method).
3. If the accessDecisionManager throws an AccessDeniedException, the interceptor will rethrow it, preventing unauthorized execution of the method.

What did we achieve with ThreadLocal? We’re able to decouple setting the authorization information (which happens in a Filter) from actually using this information in our business classes. Pretty cool, isn’t it?
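The same decoupling can be sketched in a few lines of plain Java (all names here are invented; this is not Spring Security’s actual code): a “filter” method binds per-request data to a ThreadLocal, and the business logic reads it without any context parameter in its signature.

```java
public class ContextHolderSketch {
    private static final ThreadLocal<String> currentUser = new ThreadLocal<>();

    // Stands in for the servlet filter: set the context, run the work, always clear.
    public static String handleRequest(String user) {
        currentUser.set(user);
        try {
            return businessLogic();
        } finally {
            currentUser.remove(); // crucial: server threads are pooled and reused
        }
    }

    // Business code: no "context" argument anywhere.
    private static String businessLogic() {
        return "Hello, " + currentUser.get();
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("alice")); // prints "Hello, alice"
    }
}
```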
A few caveats to keep in mind:

- ThreadLocal is not a synchronization mechanism. If an object is shared between threads, ThreadLocal will not protect you in any way.
- If your application doesn’t follow a thread-per-request model, ThreadLocal won’t be useful to pass context around.
- The scope of the ThreadLocal object is important. Normally ThreadLocal is declared as a static field to ensure we keep only a single copy. Using static also enables easy access to the field from other classes, without the need to access the instance that holds the reference to the ThreadLocal.
- ThreadLocal values become candidates for garbage collection only after their associated thread is garbage collected, so in thread pools remember to call remove() when you’re done with a value.
- ThreadLocal code can be difficult to test. If possible, wrap your ThreadLocal variable in a managed object and inject it with your Dependency Injection framework.

Last Thursday I had the pleasure of talking at “Nerd Interface”, a meetup sponsored by my employer that covers many exciting topics like Virtual Reality, IoT (Internet of Things), web/mobile development, and user experience.
Here’s the synopsis of the talk:
Have you ever wondered how Facebook knows whether you are “Liberal”, “Conservative”, or “Moderate” without explicitly asking you?
How does Amazon detect fake reviews of its products?
How do the “Twitter Funds” decide what stocks to buy or sell?
How can companies identify detractors and promoters?
In the age of information overload, automated tools are the only way of keeping up with the deluge of data generated every second. Python’s Natural Language Toolkit (NLTK) is one such tool. Natural Language Processing (NLP) and machine learning allow algorithms to extract useful and insightful information from free-form text. During this presentation we’ll see a live demonstration of the sentiment analysis functionality provided by NLTK, and how it computationally identifies and categorizes opinions straight from one of the main content sources of our era: Twitter. We’ll also examine clustering, one of the most common forms of unsupervised learning. Clustering allows us to process large quantities of text and group similar texts together, without user intervention.
The event was broadcast through Facebook Live, and you can see the recording here:
The presentation is available here.
You can also find the source code here. I encourage everyone to take a look at the code, particularly the two Python files. It’s very concise and easy to understand. If you have any questions, don’t hesitate to ask.
In part one we examined how the behavior associated with annotations is injected into the application. In the next few sections we’ll see how the implementation of the behavior actually works.
Dynamic Proxies are a special kind of object. These objects implement one or more interfaces, but their behaviour is defined not by a class but by an InvocationHandler. Any call to any of the methods of the declared interfaces will be dispatched to the InvocationHandler.
For example, a very simple InvocationHandler would look like this:
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
public class TestInvocationHandler implements InvocationHandler {
private final Object target;
public TestInvocationHandler(Object target) {
this.target = target;
}
@Override
public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
long startTime = System.nanoTime();
System.out.println("Starting profiling");
try {
return method.invoke(target, args);
} finally {
long endTime = System.nanoTime();
System.out.println("Execution time: " + (endTime - startTime));
}
}
}
The code to actually create the proxy looks like this:
import java.lang.reflect.Proxy;
public class DynamicProxyTest implements Runnable{
public static void main(String... args) {
DynamicProxyTest object = new DynamicProxyTest();
Runnable proxy = (Runnable)Proxy.newProxyInstance(
DynamicProxyTest.class.getClassLoader(),
new Class[]{Runnable.class},
new TestInvocationHandler(object));
proxy.run();
}
public void run() {
System.out.println("Test");
}
}
Keeping in mind that Dynamic Proxies are special mechanisms buried deep in the JVM, here are some useful properties to keep in mind:
proxy.getClass(): class com.sun.proxy.$Proxy0
proxy instanceof Runnable: true
Runnable.class.isAssignableFrom(proxy.getClass()): true
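These properties are easy to verify with a trivial no-op handler (the harness class below is made up for illustration; the exact proxy class name varies by JVM):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyPropertiesDemo {
    // Builds a Runnable proxy whose handler does nothing.
    public static Runnable makeProxy() {
        InvocationHandler handler = (proxy, method, args) -> null; // no-op
        return (Runnable) Proxy.newProxyInstance(
                ProxyPropertiesDemo.class.getClassLoader(),
                new Class[]{Runnable.class},
                handler);
    }

    public static void main(String[] args) {
        Runnable proxy = makeProxy();
        System.out.println(proxy.getClass());          // e.g. class com.sun.proxy.$Proxy0
        System.out.println(proxy instanceof Runnable); // true
        System.out.println(Runnable.class.isAssignableFrom(proxy.getClass())); // true
    }
}
```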
Dynamic Proxies were first introduced in Java 1.3, and their performance in the early days was far from ideal. Before Dynamic Proxies became a truly viable option, the Java community devised other mechanisms to apply these kinds of behaviours (even before the advent of annotations in Java 5). The solution was to create Java classes on the fly to provide the desired functionality. To achieve this, libraries were created to manipulate Java bytecode, which is the instruction set of the Java Virtual Machine. The most established bytecode manipulation libraries are ASM and CGLIB. However, newer and easier to use libraries have recently emerged, such as Javassist.
Bytecode manipulation ranges from very low level, using JVM opcodes directly, to higher-level mechanisms such as CGLIB’s Enhancer.
The Enhancer will generate a dynamic subclass, which enables dynamic interception. If you ever see a class name that ends in $$EnhancerByCGLIB in your debugger, you’ll know that you’re working with a class that was “Enhanced” by CGLIB.
The following example enhances a class by wrapping its output in HTML paragraph tags, <p> and </p>.
import net.sf.cglib.proxy.Enhancer;
import net.sf.cglib.proxy.MethodInterceptor;

import java.util.function.Supplier;
public class EnhancerTest implements Supplier<String> {
public static void main(String... args) {
System.out.println(new EnhancerTest().get());
System.out.println(testInvocationInterceptor().get());
}
public static Supplier<String> testInvocationInterceptor() {
Enhancer enhancer = new Enhancer();
enhancer.setSuperclass(EnhancerTest.class);
enhancer.setCallback((MethodInterceptor) (obj, method, args, proxy) -> {
if("get".equalsIgnoreCase(method.getName())
&& method.getReturnType() == String.class) {
return "<p>" + proxy.invokeSuper(obj, args) + "</p>";
} else {
return proxy.invokeSuper(obj, args);
}
});
return (Supplier) enhancer.create();
}
@Override
public String get() {
return "Hello Test!";
}
}
The output will look like this:
Hello Test!
<p>Hello Test!</p>
So how does a framework like Spring implement functionality such as transaction propagation and security? In many cases the secret behind the proxies is ThreadLocal!
In transaction management, for example, before entering a method marked as @Transactional, the proxy code will check a ThreadLocal variable to see if there’s a transaction in progress. If there is, it will be reused (reentrant transactions); otherwise a new one will be created and stored in the ThreadLocal for access down the stack.
Security annotations work in a similar fashion, with the authentication code storing the authorization information in ThreadLocal. This information is then accessible to the proxy code wrapping methods annotated with @RolesAllowed.
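A toy sketch of that transaction bookkeeping (all names invented; this is not Spring’s actual implementation) might look like this: reuse the transaction already bound to the thread, and start a new one only when none exists.

```java
public class TxManagerSketch {
    private static final ThreadLocal<String> currentTx = new ThreadLocal<>();

    // Returns the transaction id in effect, beginning a new one only when needed.
    public static String requireTransaction() {
        String tx = currentTx.get();
        if (tx == null) {
            tx = "tx-" + System.nanoTime(); // "begin" a new transaction
            currentTx.set(tx);
        }
        return tx; // reuse the transaction already bound to this thread
    }

    public static void clear() {
        currentTx.remove();
    }

    public static void main(String[] args) {
        String outer = requireTransaction();
        String inner = requireTransaction(); // a nested call on the same thread
        System.out.println(outer.equals(inner)); // prints "true": the transaction is reused
        clear();
    }
}
```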
While this is not a discussion about ThreadLocal, it’s worth mentioning that, as the name implies, the information stored in such a variable is only available to code executing in the same thread. Special logic must take care of propagating this information to other threads in the JVM, or to a remote thread in the case of RMI.
For more information on how ThreadLocal works, take a look at this post: ThreadLocal, and how it holds apps together.
Despite their bad rep in the early days, JDK dynamic proxies now have performance that is within the realm of bytecode generation. Nowadays the decision between JDK Dynamic Proxies and one of the code generation libraries is not clear cut. For example, Spring will fall back to JDK Dynamic Proxies if no code generation library is available (and as long as you’re only injecting interfaces).
Regardless of the chosen method, keep in mind that behind the covers, most modern web applications use plenty of proxies for a variety of critical functionality!
Thanks for reading, and if you have any questions or comments, feel free to drop a note in the comment section.
One of the questions I get every so often is:
Where do I put the behavior for an annotation I just created?
The answer to this question is fairly complicated. First, because annotations don’t have any behavior. They’re just metadata. Second, because most answers will normally lead to something along the lines of:
But annotations do seem to have behavior, just look at all the stuff that happens when I mark a method @Transactional, @RolesAllowed, etc…
And that’s a very valid point; some annotations do appear to have behavior. I recently even heard people half-jokingly refer to this as “Annotation Driven Development”. But in reality, annotations are just the magic dust that we sprinkle on our code, indirectly causing all kinds of black magic to happen in the background.
The truth is that the service class you annotated with @Transactional will only exhibit transactional behavior in very particular cases. Instantiating your class using the new operator will NOT add transactional behavior, regardless of the presence of @Transactional annotations.
So how does it actually work? Let’s grab our magic dust and fly away to Neverland to find out…
Aspect-oriented programming (AOP) is a programming paradigm that allows separation of cross-cutting concerns from core concerns. Cross-cutting concerns are applied based on pointcuts.
While AOP never lived up to the hype it generated at some point in the past, some of its patterns have been widely used by frameworks to apply framework provided functionality to existing code without the need of explicit invocation.
The behavior is really injected through means separate from the annotations; the annotations just mark where it should be done. We’ll now look at how it’s done at the Dependency Injection (DI) level, through Java Agents at runtime, and during the compile phase by using compiler annotation processors.
Your favorite dependency injection framework is normally one of the ways in which annotations are associated with actual behavior. Basically, the DI container will wrap the actual bean before injecting it.
We’ll see examples in the three (arguably) most popular DI frameworks. The examples will turn strings into HTML (basically just wrap the string in <p></p> tags).
A sample service that will be instrumented looks like this:
@HTMLPrettify
public class TestService {
public String buildMessage() {
return "Hello World!";
}
}
Calling testService.buildMessage() will return <p>Hello World!</p> if the interceptor is set up correctly.
In this case the core concern is building the message Hello World!, while the cross-cutting concern is applying some format to it. We can apply the same cross-cutting concern to multiple core concerns without changing either; we just have to define the right pointcuts.
In newer versions of Spring, adding an aspect is relatively easy:
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;
@Component
@Aspect
public class HTMLPrettifyAdvice {

    @Around("@annotation(htmlPrettify)")
    public Object prettify(ProceedingJoinPoint pjp, HTMLPrettify htmlPrettify) throws Throwable {
        return "<p>" + pjp.proceed() + "</p>";
    }
}

- @Component: the class is marked as a component so it gets added to the Spring container.
- @Aspect: the class must be marked as an aspect, using the standard AspectJ annotations.
- @Around: the advice is marked as an Around concern, and the pointcut is defined as code annotated with @HTMLPrettify.
For the aspect to work, Spring must be configured to enable AspectJ proxies:
@Configuration
@ComponentScan("com.andresolarte.harness.spring4")
@EnableAspectJAutoProxy
public class Spring4TestConfig {
}
CDI (Contexts and Dependency Injection) frameworks use the concept of an Interceptor. These interceptors can do a lot of things, including providing aspect advice.
A simple interceptor will look like this:
import javax.interceptor.AroundInvoke;
import javax.interceptor.Interceptor;
import javax.interceptor.InvocationContext;
@Interceptor
@HTMLPrettify
public class HTMLPrettifyInterceptor {
@AroundInvoke
public Object prettifyOutput(InvocationContext invocationContext)
throws Exception {
return "<p>" + invocationContext.proceed() + "</p>";
}
}
- @Interceptor: marks the class as an interceptor; this lets CDI know the special purpose of this class.
- @HTMLPrettify: the actual annotation that we want to provide behavior for.
- @AroundInvoke: indicates the type of advice. This gives us access to change the parameters used to invoke the underlying logic, change the return value, or return something completely different.

The annotation that we’re going to bind to has a few special requirements:
import javax.interceptor.InterceptorBinding;
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Inherited
@InterceptorBinding
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE,ElementType.METHOD})
public @interface HTMLPrettify {
}
- @Inherited: meta-annotation indicating that the annotation type is automatically inherited by subclasses.
- @InterceptorBinding: specifies that the annotated type should be associated with an interceptor.
- @Retention: should normally be RUNTIME, to ensure that the annotation is available to the JVM at runtime. The default is CLASS.

To enable the interceptor, it must be defined in the beans.xml file. This is needed to be able to define the order in which the interceptors will fire.
<beans xmlns="http://xmlns.jcp.org/xml/ns/javaee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
http://xmlns.jcp.org/xml/ns/javaee/beans_1_1.xsd"
bean-discovery-mode="all">
<interceptors>
<class>com.andresolarte.harness.cdi.interceptor.HTMLPrettifyInterceptor</class>
</interceptors>
</beans>
Sometimes the classes that we want to intercept don’t have our special annotation. In those cases, an extension provides a way to add annotations to class definitions at runtime.
Let’s say, for example, that we want to instrument third party classes that are annotated with the third party annotation @HTML. We lack the means to modify the annotation or the classes, so extensions come in to help us.
An extension can observe container lifecycle events and react to them. For example, we can observe when the CDI context discovers a class or interface.
public class HTMLAnnotationProcessor implements Extension {
<T> void processAnnotatedType(@Observes @WithAnnotations({HTML.class}) ProcessAnnotatedType<T> pat) {
AnnotatedTypeBuilder annotatedTypeBuilder = new AnnotatedTypeBuilder()
.readFromType(pat.getAnnotatedType())
.addToClass(new AnnotationLiteral<HTMLPrettify>() {
});
AnnotatedType<T> type= annotatedTypeBuilder.create();
pat.setAnnotatedType(type);
}
}
We’re also leveraging the AnnotatedTypeBuilder, which is part of the Apache DeltaSpike project. It facilitates adding our @HTMLPrettify annotation to the recently discovered type.
CDI extensions use the standard Java service provider mechanism, and need to be declared in a file named javax.enterprise.inject.spi.Extension in the META-INF/services directory.
This file will simply list the FQN of the extensions we want to use:
com.andresolarte.harness.cdi.processor.HTMLAnnotationProcessor
After doing this, any @HTML-annotated objects created by the CDI container will behave as if they had the @HTMLPrettify annotation.
Google Guice uses the standard MethodInterceptor interface defined in the AOP Alliance library:
import org.aopalliance.intercept.MethodInterceptor;
import org.aopalliance.intercept.MethodInvocation;
public class HTMLPrettifyInterceptor implements MethodInterceptor {
public Object invoke(MethodInvocation invocation) throws Throwable {
Object result = invocation.proceed();
if (invocation.getMethod().getReturnType() == String.class) {
System.out.println("Prettifying Output call to method: "
+ invocation.getMethod().getName());
result = "<p>" + result + "</p>";
}
return result;
}
}
Interceptors (the cross-cutting concern) need to be registered in one of your modules, including a Matcher specifying which objects to intercept (the pointcut definition).
public class TestModule extends AbstractModule {
@Override
protected void configure() {
bindInterceptor(Matchers.any(), Matchers.annotatedWith(HTMLPrettify.class),
new HTMLPrettifyInterceptor());
}
}
In this case we’re binding the interceptor to any class that has a method annotated with @HTMLPrettify.
Java agents can be configured at startup, and provide mechanisms to change classes as they’re being loaded by the JVM. These agents exist on a layer that is closer to the bare metal than the class loader.
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class TestAgent {
public static void premain(String agentArguments,
Instrumentation instrumentation){
ClassFileTransformer transformer=new TestTransformer();
instrumentation.addTransformer(transformer);
}
public static class TestTransformer implements ClassFileTransformer {
public byte[] transform(ClassLoader loader,
String className,
Class<?> classBeingRedefined,
ProtectionDomain protectionDomain,
byte[] classfileBuffer) throws IllegalClassFormatException {
byte[] ret=classfileBuffer;
if (shouldTransform(className)) {
ret=transform(classfileBuffer);
}
return ret;
}
}
}
Java agents provide access to the raw bytes that make up the class, the same bytes you would see in a .class file. While you’re welcome to modify the byte array by hand, it’s more practical to use a bytecode manipulation library.
A complete example using Javassist can be seen here.
Java agents must be specified in the command line when starting the JVM process:
java -javaagent:myagent.jar -jar myapp.jar
Annotation processors are executed by the standard Java compiler, javac. These processors allow extensions to apply validation rules or generate new resources. Tools such as Dagger and Lombok make their magic happen using annotation processors.
Compile-time annotation processors are registered using the standard Java service provider mechanism. They are declared in a file named javax.annotation.processing.Processor in the META-INF/services directory. This file simply lists the FQN of the extensions we want to use:
com.andresolarte.compile.processor.HTMLAnnotationProcessor
Any jar file in the compiler classpath will be scanned by the compiler to determine if it declares one or more annotation processors.
The actual annotation processor must extend AbstractProcessor:
@SupportedSourceVersion(SourceVersion.RELEASE_8)
@SupportedAnnotationTypes({
"com.andresolarte.compile.processor.HTMLPrettify"
})
public class HTMLAnnotationProcessor extends AbstractProcessor {
    @Override
    public synchronized void init(ProcessingEnvironment env) {
        super.init(env);
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment env) {
        return false; // returning true would claim the annotations for this processor
    }
}
- @SupportedSourceVersion: indicates the Java source version that the processor is compatible with.
- @SupportedAnnotationTypes: specifies which annotation types should be processed.
- init: a method called once per processor, allowing for any initialization.
- process: the method where the annotations are actually processed. This method can create new resources, or signal an error to the javac process that will halt compilation.

Lombok has become one of the most popular compiler plugins, given all of the functionality that it provides with just a few annotations.
Of special note is that compiler plugins are not supposed to change the AST (Abstract Syntax Tree). Compiler plugins can validate the resources, they can create new resources, but (in theory) they can’t change the existing code.
So how does Lombok work? It uses undocumented APIs in the HotSpot compiler and Eclipse’s JDT compiler to modify the AST.
While the functionality provided by the library is very valuable, it’s worth keeping in mind that it depends on undocumented functionality that might prevent upgrading to newer or different compilers. More discussion of these controversies can be read here.
In part two we’ll see how the behavior is actually implemented. We’ll look at Dynamic Proxies and Bytecode Manipulation.
Feel free to drop a note in the comments section.
Last May I had the pleasure of presenting at the Chicago Java Users Group (CJUG). The material I presented was about leveraging JMS and Spring Integration or Apache Camel to build scalable applications.
More information about the presentation can be found here.
The slides can be found here.
Find the source code here:
Below is the video of the presentation:
CJUG - 2016-05-16 - Andres Olarte on Spring Integration from Spantree Technology Group, LLC on Vimeo.
The Dispatch Queue contains a set of messages that ActiveMQ has destined to be sent to a particular consumer. These messages are not available to be sent to any other consumer, unless their target consumer runs into an error (such as being disconnected). These messages are streamed to the consumer to allow faster processing. This is also referred to as “pushing” messages to the consumer, in contrast to the consumer polling or “pulling” a message when it is available to process a new one.
The prefetch limit is defined by the ActiveMQ documentation as “how many messages can be streamed to a consumer at any point in time. Once the prefetch limit is reached, no more messages are dispatched to the consumer until the consumer starts sending back acknowledgements of messages (to indicate that the message has been processed)”. Basically, the prefetch limit defines the maximum number of messages to assign to the dispatch queue of a consumer. This can be seen in the following diagram (the dispatch queue of each consumer is depicted by the dotted line):
Dispatch queue with a prefetch limit of 5 and transactions enabled in the consumer
Streaming multiple messages to a client is a very significant performance boost, especially when messages can be processed quickly. Therefore the defaults are quite high (per the ActiveMQ documentation):

- persistent queues: 1000
- non-persistent queues: Short.MAX_VALUE - 1
- topics: Short.MAX_VALUE - 1
- durable topics: 100

The prefetch values can be configured at the connection level, with the value reflected in all consumers using that connection. The value can also be overridden on a per-consumer basis. Configuration details can be found in the ActiveMQ documentation.
Normally messages are distributed somewhat evenly, but by default ActiveMQ doesn’t guarantee balanced loads between the consumers (you can plug in your own dispatch policy, but in most cases that would be overkill), so in some situations the messages can end up unevenly distributed.
Tuning these numbers is normally not necessary, but if messages take (or could potentially take) a long time to process, it might be worth the effort. For example, you might want to ensure a more even balancing of message processing across multiple consumers, to allow processing in parallel. While the competing consumer pattern is very common, ActiveMQ’s Dispatch Queue can get in your way. In particular, one of the consumers can have all of the pending messages (up to the prefetch limit) assigned to its Dispatch Queue, leaving the other consumers idle. Such a case can be seen below:
Dispatch queue with a prefetch limit of 5 and transactions enabled in the consumer
This is normally not a big issue if messages are processed quickly. However, if the processing time of a message is significant, tweaking the prefetch limit is an option to get better performance.
While it’s a best practice to ensure your consumer can process messages very quickly, that’s not always possible. Sometimes you have to call a third party system that might be unreliable, or the business logic just keeps growing without much thought about the real world implications.
For consumers with very long or highly variable processing times, it is recommended to reduce the prefetch limit. A low prefetch limit prevents messages from “backing up” in the dispatch queue, earmarked for a consumer that is busy:
Dispatch queue with a prefetch limit of 5 and transactions enabled in the consumer
This behavior can be seen in the ActiveMQ console with a symptom most people describe as “stuck” messages, even though some of the consumers are idle. If this is the case, it’s worth examining the consumers:
The “Active Consumers” view can help shed light on what is actually happening:
Screenshot showing a consumer with 5 messages in its Dispatch Queue. The prefetch limit is set at 5 for this consumer.
To address the negative effects of such cases, a prefetch limit of 1 will ensure maximum usage of all available consumers:
Dispatch queue with a prefetch limit of 1 and transactions enabled in the consumer
This will negate some of the efficiencies of streaming a large number of messages to a consumer, but this is negligible in cases where processing each message takes a long time.
When the consumer is set to use Session.AUTO_ACKNOWLEDGE
, the consumer will automatically acknowledge a message as soon as it receives it, and only then start actually processing it. In this scenario, ActiveMQ has no idea whether the consumer is busy processing a message or not, and will therefore not take that message into account for dispatch queue delivery. It is therefore possible for a second message to be queued for a busy consumer, even if another consumer is idle:
Dispatch queue with a prefetch limit of 1 and auto acknowledge enabled
If Consumer 1 takes a long time processing its message, the second message could take a long time to even start being processed. This can have a significant impact on performance, and normal troubleshooting might turn up a few discrepancies, since messages sit waiting while another consumer is idle.
How can this situation be prevented? For such cases, one option is to disable the dispatch queue altogether by setting the prefetch limit to zero. This forces consumers to fetch a message every time they’re idle, instead of waiting for messages to be pushed to them. This will further degrade the performance of JMS delivery, so it should be used with care. However, it will ensure that all available consumers are kept busy:
Consumer with no dispatch queue (prefetch limit set to zero)
While the default prefetch limit is good enough for most applications, a good understanding of what is happening under the covers can go a long way in tuning a system.
Continuing the example shown in part 1, we now add a new feature: the Claim Check. This is the Apache Camel / CDI equivalent of another Spring Integration example shown before. This pattern, along with most of the groundwork used in enterprise integration, was first codified in “Enterprise Integration Patterns” by Gregor Hohpe and Bobby Woolf. This book is a must-read for anyone working in this area, regardless of the tool.
In this example we add the necessary pieces to store data in an external data store while our original message is routed through our system. The main motivation is to avoid sending and receiving large amounts of data through JMS or other similar systems that are designed for low-latency communication. Contrary to Spring Integration, Apache Camel does not offer an out-of-the-box implementation of the Claim Check pattern. The pattern and how to implement it are described here.
There is talk of introducing one for Apache Camel 3.0, but reading the discussions it becomes evident that providing a flexible general-case implementation is complicated. Spring Integration provides an implementation, but it is often insufficient, since in many cases it’s necessary to store only parts of the message (most likely binary portions), while keeping the rest of the message to be processed along the pipeline. The general flow of this pattern can be seen below:
To implement the Claim Check pattern, five different new classes were created. The first two classes provide the actual implementation of the two pieces that make up the Claim Check pattern.
The two pieces are in essence a Content Filter that removes part of the content (and stores it in a separate data store), and a Content Enricher that pulls the data from the data store back into the message. The code is seen below:
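A minimal sketch of what those two pieces can look like as Camel Processor beans. The DataStore interface (a put/get/delete abstraction), the claimCheckId header, and the bean names are assumptions of this sketch, not necessarily the post’s actual code:

```java
import java.util.UUID;

import javax.inject.Inject;
import javax.inject.Named;

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

// Content Filter: replace the (large) body with a claim-check ID
@Named("claimCheckIn")
class ClaimCheckIn implements Processor {

    @Inject
    DataStore dataStore; // assumed put/get/delete abstraction

    @Override
    public void process(Exchange exchange) throws Exception {
        String id = UUID.randomUUID().toString();
        dataStore.put(id, exchange.getIn().getBody());
        // Only the small claim-check ID travels through the pipeline
        exchange.getIn().setHeader("claimCheckId", id);
        exchange.getIn().setBody(id);
    }
}

// Content Enricher: redeem the claim check and restore the body
@Named("claimCheckOut")
class ClaimCheckOut implements Processor {

    @Inject
    DataStore dataStore;

    @Override
    public void process(Exchange exchange) throws Exception {
        String id = exchange.getIn().getHeader("claimCheckId", String.class);
        exchange.getIn().setBody(dataStore.get(id));
        dataStore.delete(id); // each check is redeemed only once
    }
}
```

Each processor would live in its own source file in a real project; they are shown together here for brevity.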
The data store provides a very simple abstraction to put, get, and delete objects from the database. The current implementation uses JDBC, but it shows the basics of how a data store can be implemented:
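A hedged sketch of such a JDBC-backed store. The DataStore interface, table name, and column names are illustrative assumptions; Java serialization is used only to keep the example short:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import javax.inject.Inject;
import javax.sql.DataSource;

// Assumed abstraction; names are illustrative
interface DataStore {
    void put(String id, Object value);
    Object get(String id);
    void delete(String id);
}

// Assumes a table such as:
//   CREATE TABLE claim_store (id VARCHAR(64) PRIMARY KEY, payload BLOB)
class JdbcDataStore implements DataStore {

    @Inject
    DataSource dataSource;

    @Override
    public void put(String id, Object value) {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO claim_store (id, payload) VALUES (?, ?)")) {
            // Serialize the payload to bytes; any binary form would do
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(value);
            }
            ps.setString(1, id);
            ps.setBytes(2, bos.toByteArray());
            ps.executeUpdate();
        } catch (Exception e) {
            throw new RuntimeException("Check in failed for id " + id, e);
        }
    }

    @Override
    public Object get(String id) {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT payload FROM claim_store WHERE id = ?")) {
            ps.setString(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null; // nothing checked in under this id
                }
                try (ObjectInputStream ois = new ObjectInputStream(
                        new ByteArrayInputStream(rs.getBytes(1)))) {
                    return ois.readObject();
                }
            }
        } catch (Exception e) {
            throw new RuntimeException("Check out failed for id " + id, e);
        }
    }

    @Override
    public void delete(String id) {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "DELETE FROM claim_store WHERE id = ?")) {
            ps.setString(1, id);
            ps.executeUpdate();
        } catch (Exception e) {
            throw new RuntimeException("Delete failed for id " + id, e);
        }
    }
}
```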
The last two new classes, H2ServerWrapper
and DataSourceProvider
, provide the connections to the database, as well as starting the in-memory H2 database. These classes are normally not needed, since connections are handled by the application server in most Java EE applications. They were added here to provide similar functionality without an application server. You’re welcome to look at the code, but in a future post I’ll explain in more detail how to provide these services from inside CDI.
Once we have our classes implementing the functionality, the wiring is very simple:
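As a sketch of that wiring (the endpoint and bean names — jms:queue:order, claimCheckIn, claimCheckOut, orderServiceHandler — follow the post; the route class name and the exact exchange-pattern setup are assumptions):

```java
import org.apache.camel.ExchangePattern;
import org.apache.camel.builder.RouteBuilder;

public class ClaimCheckRoutes extends RouteBuilder {
    @Override
    public void configure() {
        // Client side: after the JMS round trip, check the response
        // back out of the data store and restore it on the message
        from("direct:order")
            .setExchangePattern(ExchangePattern.InOut)
            .to("jms:queue:order")
            .to("bean:claimCheckOut");

        // Server side: process the request, then check the (large)
        // response into the data store before it travels back over JMS
        from("jms:queue:order")
            .to("bean:orderServiceHandler")
            .to("bean:claimCheckIn");
    }
}
```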
All it takes is adding one destination before and one after. In Camel, declaring multiple destinations will create a pipeline. This means that the message will be sent to the first destination, then the result of the first destination will be sent to the second destination, and so on. This in contrast to multicast, which will send the same message to all of the destinations.
The first change is the addition of bean:claimCheckOut
, after jms:queue:order
. This checks the data out of the store and puts it back on the message once the response arrives. Something similar is done on the server side. First we process the request using the orderServiceHandler
bean, and then put the response in the data store. This particular configuration uses the check-in only for the response.
It’s fairly easy to change it to use the claim check for both the request and the response. It’s worth noting that such cases are not common in practice, as normally either the request or the response is large, but rarely both.
In this case, the request is also stored. Since our check-in method stores the whole object, this will be an instance of BeanInvocation. This object contains the parameters and other information needed to invoke the service method. Based on your needs, you could of course extract only part of this object, for example one or more of the parameters, while keeping the rest of the message intact.
The following sections compare mostly equivalent Spring Integration XML to their Apache Camel DSL. I have not included the implementation of the extra infrastructure pieces which Apache Camel requires.
From this short tutorial, it’s evident that Apache Camel requires quite a bit of extra code to accomplish the same tasks we did with Spring Integration. However, most of it becomes irrelevant if running inside an application server, which is the main target for CDI. The biggest piece that I feel should be provided by Apache Camel is a set of data stores backed by common technologies such as JDBC, JPA, and some of the more popular NoSQL databases. Once that hurdle is overcome, however, Apache Camel shines in the ease with which complex problems can be solved in clear, concise, and easy-to-maintain code. So which one is better? A lot depends on which framework you’re already using (Spring vs. CDI), and on whether either of the integration platforms provides a particular esoteric feature you need. But in the end, both tools are very capable and easy to use.
You can find the source code for this example on GitHub.
To check out the code, clone the following repository: https://github.com/aolarte/camel-integration-samples.git.
git clone https://github.com/aolarte/camel-integration-samples.git
git checkout branches/part2
This example can be run directly from Maven. For example:
mvn exec:java -Dexec.mainClass="com.javaprocess.examples.integration.main.Main" -Dexec.args="server client"
The main business logic of the application is an almost exact copy of the Spring Integration example (Application.java
, OrderServiceHandler.class
, etc…). The only difference is in the annotations, since we’re using JSR-299 CDI annotations like @Inject
and @Produces
.
The application has a “client” portion that generates messages that are serviced by a bean. This bean is either located in the same JVM, or remotely accessible through a JMS queue. The objective of this example is to show how easy it is to use this pattern to create distributed services that enable spreading particularly resource-intensive operations to backend nodes. This is the same operation as in the Spring Integration post mentioned previously, therefore the flow can be shown using the exact same diagrams as in the previous example:
The main routing logic is found in CamelContext.java
.
For anyone used to Spring Integration, one of the most obvious things that jumps out is the abundance of code to wire the Camel and CDI components together. On the other hand, there is basically no XML. The only XML file (beans.xml
) is basically empty.
CDI favors using code to wire components. This is done mostly through annotations, and simple producer methods (which are annotated with @Produces
). This is philosophically very different from Spring. Camel does provide a way to configure its routes using XML, but it requires Spring, so it’s not really viable with CDI.
The general format of a Camel route is to define a source (using RouteBuilder.from()
) and a destination (using RouteBuilder.to()
). RouteBuilder
provides a fluent API, and extra parameters can be set, for example the Exchange Pattern, which is needed to get a response back.
The URI passed to the from() and to() methods is always prefixed with the component name.
For example the following definition will route requests from a direct endpoint named order
, to a bean named orderServiceHandler
:
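In the Java DSL this is expressed inside a RouteBuilder; a minimal sketch (the route class name is illustrative):

```java
import org.apache.camel.builder.RouteBuilder;

public class OrderRoute extends RouteBuilder {
    @Override
    public void configure() {
        // Route messages from the internal "direct:order" endpoint
        // to the bean registered under the name "orderServiceHandler"
        from("direct:order").to("bean:orderServiceHandler");
    }
}
```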
In this case we use the direct
component, which defines an internal connection, and the jms
component. The bean
component provides a container-agnostic way to access beans in a Dependency Injection container.
In this case we’re including the “camel-cdi” artifact, which allows Camel to locate beans in a JSR-299 (CDI) container such as Weld. Similar modules exist for other containers such as Spring, or a Registry may be manually maintained.
The URI also includes the destination name, as well as extra parameters. For example, we can define that 5 concurrent listeners (threads) will consume from a single queue:
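A sketch using the JMS component’s concurrentConsumers option (the queue and bean names follow the post’s example; the route class name is illustrative):

```java
import org.apache.camel.builder.RouteBuilder;

public class OrderConsumerRoute extends RouteBuilder {
    @Override
    public void configure() {
        // Five listener threads will consume from the "order" queue,
        // each invoking the orderServiceHandler bean
        from("jms:queue:order?concurrentConsumers=5")
            .to("bean:orderServiceHandler");
    }
}
```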
Camel can be hooked into CDI to provide beans that can be consumed by using @Produces
. This method will produce objects of type IOrderService
. The resulting object will be a proxy that will be backed by the “direct:order” endpoint.
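A sketch of what such a producer method can look like. IOrderService and the direct:order endpoint follow the post’s example; the producer class name and the injection of the CamelContext are assumptions, and ProxyHelper comes from Camel’s bean component:

```java
import javax.enterprise.inject.Produces;

import org.apache.camel.CamelContext;
import org.apache.camel.Endpoint;
import org.apache.camel.component.bean.ProxyHelper;

public class OrderServiceProducer {

    // Any injection point of type IOrderService receives a proxy;
    // invoking its methods sends a message to the "direct:order" endpoint
    @Produces
    public IOrderService createOrderService(CamelContext context) throws Exception {
        Endpoint endpoint = context.getEndpoint("direct:order");
        return ProxyHelper.createProxy(endpoint, IOrderService.class);
    }
}
```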
A bit of extra logic was added to determine which routes to add, based on the functionality desired by the user and passed as command line parameters. In the Spring example this was controlled based on which files were to be included.
This does limit the flexibility of the solution, since there are no XML files to tweak after building the application.
It is worth noting that CDI only supports mapping one method. In practice the interface can have more than one method, but the invocation of any of those methods will be sent to the same endpoint.
More sophisticated manipulation can be done, but in my experience it’s not worth the effort unless absolutely necessary.
Other functions that were defined in XML in the Spring example are now done in code, for example starting an embedded ActiveMQ JMS Broker and JMS connection factory. These are handled inside JMSConnectionFactoryProvider
, but in most cases will be handled by a Java EE container.
This example can be run in the same manner as the Spring Integration one. The code must first be compiled:
mvn compile
Then it can be run from Maven, passing one (or two) of three parameters. The three possible parameters are:
direct: Uses a direct connection between the client and the bean handling the requests.
server: Starts a server that will monitor an ActiveMQ queue, and process any requests sent to it. This option will also start an embedded ActiveMQ server, which prevents more than one server from running at once. If using an external ActiveMQ server, no such restriction exists.
client: Starts the distributed client, which will send and receive requests through ActiveMQ. This can be used in combination with the server parameter. For example:
mvn exec:java -Dexec.mainClass="com.javaprocess.examples.integration.main.Main" -Dexec.args="server client"
Will result in the following output:
Running in client mode
Requesting order processing on thread: 38
Requesting order processing on thread: 41
Requesting order processing on thread: 39
Requesting order processing on thread: 40
Requesting order processing on thread: 42
Got order with id 100 on thread: 33
Got order with id 100 on thread: 32
Got order with id 100 on thread: 30
Got order with id 100 on thread: 29
Got order with id 100 on thread: 31
Order was requested by 40 and by processed by thread: 33
Order was requested by 42 and by processed by thread: 32
Order was requested by 41 and by processed by thread: 29
Order was requested by 39 and by processed by thread: 31
Order was requested by 38 and by processed by thread: 30
Stop Camel
The following sections compare mostly equivalent Spring Integration XML to their Apache Camel DSL.
I’m a big fan of CDI to wire Java EE applications. However, for wiring standalone applications, CDI is lacking. For a small application I would rather go with something simpler like Google Guice.
However, to show a comparable application to the standalone Spring Integration example, I have used Weld, the reference implementation for CDI and one of the most popular implementations out there.
One particular challenge was starting a bean eagerly. If running inside a Java EE container, this could have been achieved with a single annotation: @Startup
. It is in cases like this that it becomes obvious that CDI is meant to complement Java EE. However, for my standalone example I had to implement an extension to achieve this behavior. While CDI provides a way to do it, it is still not ideal. More information on how this is achieved can be seen in this detailed post.
I hope that this short post has shown the value of the Enterprise Integration Patterns, regardless of the implementation. Both Apache Camel and Spring Integration provide a rich set of the Enterprise Integration Patterns, which can be leveraged to solve complex real-world problems.
You can find the source code for this example on GitHub.
To check out the code, clone the following repository: https://github.com/aolarte/camel-integration-samples.git.
git clone https://github.com/aolarte/camel-integration-samples.git
MessageBodyWritter
as shown below:
package csv;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import javax.ws.rs.Produces;
import javax.ws.rs.WebApplicationException;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.MultivaluedMap;
import javax.ws.rs.ext.MessageBodyWriter;
import javax.ws.rs.ext.Provider;
import java.io.IOException;
import java.io.OutputStream;
import java.lang.annotation.Annotation;
import java.lang.reflect.Type;
import java.util.List;
@Provider
@Produces("text/csv")
public class CSVMessageBodyWritter implements MessageBodyWriter<List<?>> {

    @Override
    public boolean isWriteable(Class<?> type, Type genericType, Annotation[] annotations, MediaType mediaType) {
        return List.class.isAssignableFrom(type);
    }

    @Override
    public long getSize(List<?> data, Class<?> type, Type genericType, Annotation[] annotations, MediaType mediaType) {
        // Deprecated as of JAX-RS 2.0: the returned value is ignored
        return 0;
    }

    @Override
    public void writeTo(List<?> data, Class<?> type, Type genericType, Annotation[] annotations, MediaType mediaType,
                        MultivaluedMap<String, Object> httpHeaders, OutputStream entityStream) throws IOException, WebApplicationException {
        if (data != null && !data.isEmpty()) {
            // Build a CSV schema from the first element's class,
            // and emit a header row before the data rows
            CsvMapper mapper = new CsvMapper();
            CsvSchema schema = mapper.schemaFor(data.get(0).getClass()).withHeader();
            mapper.writer(schema).writeValue(entityStream, data);
        }
    }
}
To use our MessageBodyWriter
, it must be registered. This can be achieved in several ways, depending on your JAX-RS implementation. Normally Jersey and other JAX-RS implementations are configured to scan packages and look for resources. In such cases, classes marked with @Provider
will be registered automatically. In other cases, the registration will have to be done manually. For example in Dropwizard, you have to manually register the Writer at startup:
@Override
public void run(MyConfiguration configuration,
Environment environment) {
environment.jersey().register(new CSVMessageBodyWritter());
}
Once registered, it becomes trivial to have a web service that outputs CSV. It’s just a matter of annotating your web service with the same media type as the MessageBodyWritter
: @Produces("text/csv")
in this case.
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import java.util.List;
@Path("/status")
public class StatusResource {
@GET
@Produces("text/csv")
public List<Data> getData() {
// "service" is assumed to be injected by your framework; its implementation is not shown
List<Data> data = service.getStatus();
return data;
}
}
Our data class is just a normal POJO:
public class Data {
private String date;
private Integer minimum;
private Integer maximum;
private Integer average;
public Data() {
}
//Getters and setters as needed
}
The output will look like this:
average,date,maximum,minimum
90,3/1856,125,0
60,2/1856,115,16
60,4/1856,115,16
This is a very simple implementation, but it should be a good starting point. It’s worth noting that CSV is inherently limited, and can’t easily represent hierarchical object graphs. Therefore you might need to flatten your data before exporting to CSV. If you need to ingest CSV in a web service, you can follow a similar approach to create a MessageBodyReader
that will create an object from a CSV stream.
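A hedged sketch of such a reader, again using Jackson’s CsvMapper. This version reads a single object per request; handling List payloads would require inspecting the generic type. The class name is illustrative:

```java
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

import javax.ws.rs.Consumes;
import javax.ws.rs.WebApplicationException;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.MultivaluedMap;
import javax.ws.rs.ext.MessageBodyReader;
import javax.ws.rs.ext.Provider;
import java.io.IOException;
import java.io.InputStream;
import java.lang.annotation.Annotation;
import java.lang.reflect.Type;

@Provider
@Consumes("text/csv")
public class CSVMessageBodyReader implements MessageBodyReader<Object> {

    @Override
    public boolean isReadable(Class<?> type, Type genericType, Annotation[] annotations, MediaType mediaType) {
        return true; // matching is narrowed by the @Consumes media type
    }

    @Override
    public Object readFrom(Class<Object> type, Type genericType, Annotation[] annotations, MediaType mediaType,
                           MultivaluedMap<String, String> httpHeaders, InputStream entityStream)
            throws IOException, WebApplicationException {
        // Expect a header row, then map the following row onto "type"
        CsvMapper mapper = new CsvMapper();
        CsvSchema schema = mapper.schemaFor(type).withHeader();
        MappingIterator<Object> it = mapper.readerFor(type).with(schema).readValues(entityStream);
        return it.next();
    }
}
```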