High-frequency trading in a limit order book

market maker

Stochastic control theory is well suited to this kind of optimization problem: it seeks the strategy that maximizes the trader’s objective function while addressing the twofold problem that high-frequency trading poses. It supports the study of optimization in financial markets because it can handle complex problems whose constraints are consistent with the price dynamics, while also managing inventory risk. Finding the optimal quotes in the market therefore requires solving the nonlinear Hamilton-Jacobi-Bellman equation associated with the stochastic control problem. This is generally done with root-finding algorithms that can cope with the complexity and high dimensionality of the equation.



In this, the most time-consuming step of the backtest process, our algorithms learned from their trading environment which AS model parameter values to choose for every five seconds of trading (see Section 4.1.3).

In the model’s notation, t_j is the current time upon arrival of the j-th market tick, p_m is the current market mid-price, I is the current size of the inventory held, γ is a constant that models the agent’s risk aversion, and σ² is the variance of the market mid-price, a measure of volatility.

Table 12, obtained from all simulations, shows that traders using Model c earn a relatively higher return but also incur a relatively higher standard deviation than with the other models. The Sharpe ratios of the models indicate that the stock price models with stochastic volatility based on a quadratic utility function produce more attractive portfolios than the other models.
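The symbols defined above are the inputs of the Avellaneda-Stoikov reservation price. The following is a minimal sketch, assuming the commonly cited form r = p_m − I·γ·σ²·(T − t_j); the terminal time T, the function name and the sample numbers are assumptions introduced for illustration and do not come from the text above.

```python
def reservation_price(mid_price, inventory, gamma, sigma2, t, T):
    """Reservation price in the commonly cited Avellaneda-Stoikov form (assumed).

    mid_price: current market mid-price (p_m)
    inventory: signed inventory held (I)
    gamma:     risk-aversion constant
    sigma2:    variance of the mid-price
    t, T:      current time and end of the session (T is an assumption)
    """
    if t > T:
        raise ValueError("t must not exceed T")
    return mid_price - inventory * gamma * sigma2 * (T - t)


# Illustrative numbers only: a long inventory pulls the reservation price
# below the mid-price, a short one pushes it above, a flat one leaves it as is.
long_r = reservation_price(100.0, inventory=5, gamma=0.1, sigma2=4.0, t=0.0, T=1.0)
short_r = reservation_price(100.0, inventory=-5, gamma=0.1, sigma2=4.0, t=0.0, T=1.0)
flat_r = reservation_price(100.0, inventory=0, gamma=0.1, sigma2=4.0, t=0.0, T=1.0)
```

The sign of the inventory term is what makes the quotes lean toward reducing the position.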

To use this new strategy on hummingbot, enter avellaneda_market_making as the strategy name. After that, use config order_book_depth_factor and config risk_factor to set your custom values. On hummingbot, you choose the target asset inventory, and the bot calculates the value of q, the parameter that measures the difference between the current inventory position and the desired one. If you want to end the trading session with your entire inventory allocated to USDT, set this target to 0.

What is the optimal spread?

3 with stock price dynamics as “Model 1” and the model with the dynamics “Model 2”. Keeping the remaining parameters in Table 1 unchanged, the solutions match our expectations, as can be tracked in Table 8. Likewise, keeping the other parameters the same as in Table 1, our expectation above matches the solutions obtained, as can be seen in Table 7. As the trader expects the price to move up, she sends orders at higher prices to profit from the price increase, which meets our expectation. On the other hand, the results show that our strategy has a lower standard deviation.

The cumulative profit resulting from a market maker’s operations comes from the successive execution of trades on both sides of the spread. This profit from the spread is endangered when the market maker’s buy and sell operations are not balanced overall in volume, since this will increase the dealer’s asset inventory. The larger the inventory is, be it positive or negative, the higher the holder’s exposure to market movements. Hence, market makers try to minimize risk by keeping their inventory as close to zero as possible. Market makers tend to do better in mean-reverting environments, whereas market momentum, in either direction, hurts their performance.

Random forest is an efficient and accurate classification model, which makes decisions by aggregating a set of trees, either by voting or by averaging class posterior probability estimates.

This is an efficient way of arriving at quasi-optimal values for these parameters given the market environment in which the agent begins to operate. From this point, the RL agent can gradually diverge as it learns by operating in the changing market. We were able to achieve some parallelisation by running five backtests simultaneously on different CPU cores. Once the five parallel backtests had finished, their five memory replay buffers were merged. Ten such training iterations were completed, all on data from the same full day of trading, with the memory replay buffer resulting from each iteration fed into the next. The replay buffer obtained from the final iteration was used as the initial one for the test phase.
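The merging of replay buffers from parallel backtests can be sketched as follows. This is only an illustration: the text does not say how the buffers were ordered or trimmed, so the simple concatenation with a fixed capacity, the function name and the toy experience tuples are all assumptions.

```python
from collections import deque


def merge_replay_buffers(buffers, capacity):
    """Concatenate experience buffers, keeping only the newest `capacity` items.

    The eviction rule (drop the oldest entries first) is an assumption.
    """
    merged = deque(maxlen=capacity)
    for buf in buffers:
        merged.extend(buf)
    return merged


# Five hypothetical backtests, each yielding three
# (state, action, reward, next_state) experience tuples.
parallel = [[(f"s{i}", 0, float(i), f"s{i}_next")] * 3 for i in range(5)]
merged = merge_replay_buffers(parallel, capacity=10)
```

The merged buffer would then seed the next training iteration, as the paragraph above describes.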

Applied Mathematical Finance

Thus, the DQN approximates a Q-learning function by outputting, for each input state s, a vector of Q-values, which is equivalent to checking the row for s in a Q(s, a) matrix to obtain the Q-value for each action from that state. A discount factor (γ) gives future rewards less weight than more immediate ones when estimating the value of an action (an action’s value is its relative worth in terms of the maximization of the cumulative reward at termination time).

To maximize trade profitability, spreads should be enlarged such that the expected future value of the account is maximized.
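The row-lookup view of Q-values and the effect of the discount factor can be illustrated with a toy table. The states, actions and numbers below are hypothetical, and γ here is the RL discount factor, not the risk-aversion constant of the Avellaneda-Stoikov model.

```python
# Hypothetical Q-table: one row per state, one column per action.
q_table = {
    "s0": [0.2, 1.5, -0.3],
    "s1": [0.0, 0.1, 0.9],
}


def q_values(state):
    """Return the vector of Q-values for every action available from `state`."""
    return q_table[state]


def greedy_action(state):
    """Index of the action with the highest Q-value in the row for `state`."""
    values = q_values(state)
    return max(range(len(values)), key=values.__getitem__)


def discounted_return(rewards, gamma):
    """Sum of rewards, each weighted by gamma**k for a delay of k steps."""
    return sum(r * gamma**k for k, r in enumerate(rewards))


best = greedy_action("s0")
total = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

A DQN replaces the dictionary lookup with a neural network that maps the state to the same kind of vector.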

The Avellaneda-Stoikov procedure underpinning the market-making actions in the models under discussion is explained in Section 2. Section 3 provides an overview of reinforcement learning and its uses in algorithmic trading. The deep reinforcement learning models (Alpha-AS-1 and Alpha-AS-2) developed to work with the Avellaneda-Stoikov algorithm are presented in detail in Section 4, together with an Avellaneda-Stoikov model (Gen-AS) without RL with parameters obtained with a genetic algorithm.

On this performance indicator, Gen-AS was the overall best-performing model, winning on 11 days. The mean Max DD for the Gen-AS model over the entire test period was visibly the lowest, and its standard deviation was also by far the lowest among all models. In comparison, both the mean and the standard deviation of the Max DD for the Alpha-AS models were very high. However, the differences in Max DD performance between Gen-AS and either of the Alpha-AS models, over all test days, are not statistically significant, despite the large differences in means. The latter are a result of extreme outliers for the Alpha-AS models from days on which these obtained a very poor (i.e., high) value for Max DD. The medians, however, are very similar to the median for the Gen-AS model. Mann-Whitney tests were used to compare the four daily performance indicator values (Sharpe, Sortino, Max DD and P&L-to-MAP) obtained for the Gen-AS model with the corresponding values obtained for the other models, over the 30 test days.
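For readers unfamiliar with the Max DD indicator, the sketch below computes a maximum drawdown from an equity curve. The exact definition used in the study (absolute or relative, per day or per session) is not given above, so the absolute peak-to-trough version shown here is an assumption.

```python
def max_drawdown(equity):
    """Largest peak-to-trough drop of an equity curve, in absolute terms."""
    if not equity:
        raise ValueError("equity curve must not be empty")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)           # highest value seen so far
        worst = max(worst, peak - value)  # deepest drop from that peak
    return worst


# Illustrative curve: the peak is 120 and the lowest later value is 80.
dd = max_drawdown([100, 120, 90, 110, 80])
```

A higher value is worse, which is why a few extreme days can dominate the mean while leaving the median almost unchanged.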



For mature markets, such as the U.S. and Europe, the real-time LOB is event-based and updates at high speed, at intervals ranging from milliseconds down to nanoseconds. The dataset from the Nasdaq Nordic stock market in Ntakaris et al. contains 100,000 events per stock per day, and the dataset from the London Stock Exchange in Zhang et al. contains 150,000. In contrast, exchanges in the Chinese A-share market publish level II data, essentially a 10-level LOB, every three seconds on average, with 4500–5000 daily ticks. This snapshot data provides us with the opportunity to leverage the longer tick-time interval and make profits using machine learning algorithms.

On the P&L-to-MAP ratio, the single best-performing model was Alpha-AS-2, winning on 16 of the test days and coming second on 10 (on 9 of which it lost to Alpha-AS-1). Alpha-AS-1 had 11 victories and placed second 16 times (losing to Alpha-AS-2 on 14 of these). Gen-AS had the best P&L-to-MAP ratio on only 2 of the test days, coming second on another 4. The mean and the median P&L-to-MAP ratio were very significantly better for both Alpha-AS models than for the Gen-AS model.

Journal of Economic Dynamics and Control



The market maker can post competitive bid and ask prices that improve on the current market price in order to manage the inventory. Optimal strategies for market makers have been studied by academic researchers for a long time, with Thomas Ho and Hans Stoll starting to write about market dealer dynamics in 1980. The reasoning behind this parameter is that, as the trading session approaches its end, the market maker wants to have an inventory position similar to the one he had when the trading session started.

  • The results obtained suggest avenues to explore for further improvement.
  • Moreover, all the solutions that the scientific community has proposed are static, i.e., the system’s behavior does not change as boundary conditions change.
  • An ε-greedy policy is followed to determine the action to take during the next 5-second window, choosing between exploration, with probability ε, and exploitation, with probability 1-ε.
  • Furthermore, in the case of jumps in volatility, it is observed that a higher profit can be obtained, but with a larger standard deviation.
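The ε-greedy policy mentioned in the list above can be sketched in a few lines. The function name and the sample Q-values are hypothetical; only the rule itself (explore with probability ε, otherwise exploit) comes from the text.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        # Exploration: any action, chosen uniformly at random.
        return rng.randrange(len(q_values))
    # Exploitation: the action with the highest estimated Q-value.
    return max(range(len(q_values)), key=q_values.__getitem__)


# With epsilon = 0 the choice is always greedy; with epsilon = 1 always random.
greedy_choice = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
random_choice = epsilon_greedy([0.1, 0.9, 0.3], epsilon=1.0)
```

In practice ε is usually lowered over time so that the agent explores early and exploits later, although the schedule used here is not stated.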

So, as the trading session gets closer to its end, order spreads become smaller and the reservation price becomes more “aggressive” in rebalancing the inventory. You might have noticed that I haven’t included volatility (σ) in the list of main factors, even though it is part of the formula. That is because the volatility value depends on market price movement and isn’t a factor defined by the market maker. If market volatility increases, the distance between the reservation price and the market mid-price also increases. But this kind of approach, depending on the market situation, might skew the market maker’s inventory in one direction, putting the trader in the wrong position as the asset value moves against him.
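The effect of the remaining session time and of volatility described above can be checked numerically. The sketch below assumes the commonly cited Avellaneda-Stoikov expressions: a total spread of γσ²(T − t) + (2/γ)·ln(1 + γ/κ) and a reservation price offset of |q|·γ·σ²·(T − t). The liquidity parameter κ, the function names and the sample numbers are assumptions that do not appear in the text above.

```python
import math


def optimal_spread(gamma, sigma2, kappa, time_left):
    """Total bid-ask spread in the commonly cited AS form (assumed)."""
    return gamma * sigma2 * time_left + (2.0 / gamma) * math.log(1.0 + gamma / kappa)


def reservation_offset(inventory, gamma, sigma2, time_left):
    """Distance between reservation price and mid-price (assumed AS form)."""
    return abs(inventory) * gamma * sigma2 * time_left


# The spread narrows as the session approaches its end (time_left shrinks).
early = optimal_spread(gamma=0.1, sigma2=4.0, kappa=1.5, time_left=1.0)
late = optimal_spread(gamma=0.1, sigma2=4.0, kappa=1.5, time_left=0.1)

# Higher variance widens the gap between reservation price and mid-price.
calm = reservation_offset(inventory=5, gamma=0.1, sigma2=4.0, time_left=0.5)
volatile = reservation_offset(inventory=5, gamma=0.1, sigma2=9.0, time_left=0.5)
```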


No significant differences were found between the two Alpha-AS models.

These agents, therefore, must learn everything about the problem at hand, and the learning curve is steeper and slower to surmount than if relevant available knowledge were leveraged to guide them.

Figure 3 depicts one simulation of the profit and loss function of the market maker at any time t during the trading session in the left panel. The profit and loss performance of the trading is displayed by the cash level histogram in the left panel.

Mean decrease impurity is a feature-specific measure of the mean reduction of weighted impurity over all the nodes in the tree ensemble that partition the data samples according to the values of that feature.

Here the 0 subscript denotes the best orderbook price level on the ask and on the bid side, i.e., the price levels of the lowest ask and of the highest bid, respectively.

Market indicators consist of features describing the state of the environment.
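The level-0 convention described above (lowest ask, highest bid) can be made concrete with a small order book snapshot. The prices are invented, and the two derived quantities shown, quoted spread and mid-price, are common examples chosen for illustration; the text does not say which features were actually built from these levels.

```python
def best_levels(ask_prices, bid_prices):
    """Return the level-0 prices: the lowest ask and the highest bid."""
    return min(ask_prices), max(bid_prices)


def quoted_spread_and_mid(ask_prices, bid_prices):
    """Quoted spread and mid-price derived from the level-0 prices."""
    ask0, bid0 = best_levels(ask_prices, bid_prices)
    return ask0 - bid0, (ask0 + bid0) / 2.0


# Hypothetical three-level snapshot on each side of the book.
asks = [101.0, 101.5, 102.0]
bids = [100.0, 99.5, 99.0]
spread, mid = quoted_spread_and_mid(asks, bids)
```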


α is the learning rate (α ∈ [0, 1]), which reduces to a fraction the amount of change that is applied to Qi from the observation of the latest reward and the expectation of optimal future rewards. This limits the influence of a single observation on the Q-value to which it contributes.

@RRG Right, it makes sense that the market maker can place quotes improving on the current mid-price. So I guess the fact that the plot in the original paper does not show crossing between the market maker’s quotes and the mid-price is just a coincidence. However, this situation does not need to happen, so there is no guarantee he will set prices compatible with current market prices.

Closing_time – Here, you set how long each “trading session” will take.
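Returning to the learning rate α described above, the sketch below shows one step of the standard tabular Q-learning update, in which α scales how far the old estimate moves toward the newly observed target. The function name and the numbers are hypothetical, and the update used by the deep models discussed in this article may differ in detail.

```python
def q_update(q_old, reward, max_next_q, alpha, gamma):
    """One tabular Q-learning step: move q_old a fraction alpha toward the target.

    alpha: learning rate, gamma: discount factor applied to future rewards.
    """
    target = reward + gamma * max_next_q
    return q_old + alpha * (target - q_old)


# alpha = 0 ignores the new observation, alpha = 1 replaces the old estimate,
# and intermediate values blend the two.
unchanged = q_update(1.0, reward=2.0, max_next_q=4.0, alpha=0.0, gamma=0.5)
blended = q_update(1.0, reward=2.0, max_next_q=4.0, alpha=0.5, gamma=0.5)
replaced = q_update(1.0, reward=2.0, max_next_q=4.0, alpha=1.0, gamma=0.5)
```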

