There is a common task in Excel that seems like it should have a very simple solution. Alas, when googling for it you get all these inexplicable, crappy "tutorial" sites that either show you something completely different or something that you cannot actually do because you don't have the latest version of Office. Well, enough of this!

  The task I am talking about is just selecting a range of values and concatenating them using a specified separator - what in a programming language like C# would be string.Join, or in JavaScript the array join function. I find it very useful when, for example, I copy a result from SQL and I want to generate an INSERT or UPDATE query. And the only out of the box solution, TEXTJOIN, is available in Office 365 alone.

  You use it like =TEXTJOIN(", ", FALSE, A2:A8) or =TEXTJOIN(", ", FALSE, "The", "Lazy", "Fox"), where the parameters are:

  • a delimiter
  • a boolean to determine if empty cells are ignored
  • a series of text values or a range of cells

  But, you can have this working in whatever version of Excel you want by just using a User Defined Function (UDF), one specified in this lovely and totally underrated Stack Overflow answer: MS Excel - Concat with a delimiter.

  Long story short:

  • open the Excel sheet that you want to work on 
  • press Alt-F11 which will open the VBA interface
  • insert a new module
  • paste the code from the SO answer (also copy pasted here, for good measure)
  • press Alt-Q to leave
  • if you want to save the Excel with the function in it, you need to save it as a format that supports macros, like .xlsm

And look at the code. I mean, it's ugly, but it's easy to understand. What other things could you implement that would just simplify your work and allow Excel files to be smarter, without having to code an entire Excel add-in? For example, I could create my own GenerateSqlInsert function that would handle column names, NULL values and so on (there is a rough sketch of that idea after the TEXTJOIN code below).

Here is the TEXTJOIN mimicking UDF to insert in a module:

Function TEXTJOIN(delim As String, skipblank As Boolean, arr)
    Dim d As Long
    Dim c As Long
    Dim arr2()
    Dim t As Long, y As Long
    t = -1
    y = -1
    If TypeName(arr) = "Range" Then
        arr2 = arr.Value
    Else
        arr2 = arr
    End If
    On Error Resume Next
    t = UBound(arr2, 2)
    y = UBound(arr2, 1)
    On Error GoTo 0

    If t >= 0 And y >= 0 Then
        For c = LBound(arr2, 1) To UBound(arr2, 1)
            For d = LBound(arr2, 2) To UBound(arr2, 2)
                If arr2(c, d) <> "" Or Not skipblank Then
                    TEXTJOIN = TEXTJOIN & arr2(c, d) & delim
                End If
            Next d
        Next c
    Else
        For c = LBound(arr2) To UBound(arr2)
            If arr2(c) <> "" Or Not skipblank Then
                TEXTJOIN = TEXTJOIN & arr2(c) & delim
            End If
        Next c
    End If
    ' remove the trailing delimiter (only if anything was joined at all)
    If Len(TEXTJOIN) > 0 Then
        TEXTJOIN = Left(TEXTJOIN, Len(TEXTJOIN) - Len(delim))
    End If
End Function
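
  And since I mentioned it, here is a rough sketch of what a GenerateSqlInsert function could look like. This is not from the Stack Overflow answer, just a hypothetical example of mine: it assumes the first row of the selected range holds the column names and builds one INSERT statement per data row, quoting text values and turning empty cells into NULL:

Function GenerateSqlInsert(tableName As String, rng As Range) As String
    ' hypothetical helper, not part of the SO answer
    Dim r As Long, c As Long
    Dim cols As String, vals As String, result As String
    Dim v As Variant
    ' build the column list from the header row
    For c = 1 To rng.Columns.Count
        cols = cols & IIf(c > 1, ", ", "") & rng.Cells(1, c).Value
    Next c
    ' build one INSERT statement per data row
    For r = 2 To rng.Rows.Count
        vals = ""
        For c = 1 To rng.Columns.Count
            v = rng.Cells(r, c).Value
            If IsEmpty(v) Then
                vals = vals & IIf(c > 1, ", ", "") & "NULL"
            ElseIf IsNumeric(v) Then
                vals = vals & IIf(c > 1, ", ", "") & v
            Else
                vals = vals & IIf(c > 1, ", ", "") & "'" & Replace(CStr(v), "'", "''") & "'"
            End If
        Next c
        result = result & "INSERT INTO " & tableName & " (" & cols & ") VALUES (" & vals & ");" & vbCrLf
    Next r
    GenerateSqlInsert = result
End Function

You would then call it from a cell with something like =GenerateSqlInsert("MyTable", A1:C11).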

Hope it helps!

  I got lazy about the blog and it shows. I rarely write about anything technical and the state of the existing posts is pretty poor. I want to improve this, but I don't have a lot of time, so I would certainly appreciate feedback on what you find fixable (in general or for particular posts) or on new features that you might like.

  I've updated the file structure of the blog, working mostly on fixing the comments. It should now be easier to add structured text in a comment, like HTML, code, links, etc. I also tried to fix the old comments (there were three different sources of comments until now, with some duplicates).

  Please let me know if you see issues with adding comments or reading existing comments. Or perhaps missing comments.

  I've added a link to a post you can comment on for general issues (see the top right section of the blog; the last icon is for comments).

  Thanks!

  I haven't been working on the Sift string distance algorithm for a while, but then I was reminded of it because someone wanted to use it to suggest corrections to user input. Something like Google's "Did you mean...?" or like an autocomplete application. And it got me thinking of ways to use Sift for bulk searching. I am still thinking about it, but in the meanwhile, this can be achieved using the Sift4 algorithm, with up to 40% improvement in speed over the naïve comparison with each item in the list.

  Testing this solution, I've realized that the maxDistance parameter did not work correctly. I apologize. The code is now fixed on the algorithm's blog post, so go and get it.

  So what is this solution for mass search? We can use two pieces of knowledge about the problem space:

  • the minimum possible distance between two strings of lengths l1 and l2 will always be abs(l1-l2)
    • it's very easy to understand the intuition behind it: one cannot generate a string of size 5 from a string of size 3 without adding at least two new letters, so the minimum distance would be 2
  • as we advance through the list of strings, we have a best distance value that we keep updating
    • this maps very well onto the maxDistance parameter of Sift4

  Thus armed, we can find the best matches for our string from a list using the following steps:

  1. set a bestDistance variable to a very large value
  2. set a matches variable to an empty list
  3. for each of the strings in the list:
    1. compare the minimum distance between the search string and the string in the list (abs(l1-l2)) to bestDistance
      1. if the minimum distance is larger than bestDistance, ignore the string and move to the next
    2. use Sift4 to get the distance between the search string and the string in the list, using bestDistance as the maxDistance parameter
      1. if the algorithm reaches a temporary distance that is larger than bestDistance, it will break early and report the temporary distance, which we will ignore
    3. if distance<bestDistance, then clear the matches list and add the string to it, updating bestDistance to distance
    4. if distance=bestDistance, then add the string to the list of matches

  When using the common Sift4 version, which doesn't compute transpositions, the list of matches is retrieved 40% faster on average than simply searching through the list of strings and updating the distance (about 15% faster with transpositions). Considering that Sift4 is already a lot faster than Levenshtein, this method will allow searching through hundreds of thousands of strings really fast. The gained time can be used to further refine the matches list using a slower but more precise algorithm, like Levenshtein, on a much smaller set of possible matches.

  Here is a sample written in JavaScript, where we search a random string in the list of English words:

const search = getRandomString(); // this is the search string
let matches=[];             // the list of found matches
let bestDistance=1000000;   // the smallest distance to our search string found so far
const maxOffset=5;          // a common value for searching similar strings
const l = search.length;    // the length of the search string
for (let word of english) {
    const minDist=Math.abs(l-word.length); // minimum possible distance
    if (minDist>bestDistance) continue;    // if even the minimum is too large, skip this word
    const dist=sift4(search,word,maxOffset,bestDistance);
    if (dist<bestDistance) {
        matches = [word];                  // new array with a single item
        bestDistance=dist;
        if (bestDistance==0) break;        // if an exact match, we can exit (optional)
    } else if (dist==bestDistance) {
        matches.push(word);                // add the match to the list
    }
}

  There are further optimizations that can be added, beyond the scope of this post:

  • words can be grouped by length and the minimum distance check can be done on entire buckets of strings of the same length (see the sketch after this list)
  • words can be sorted, and when a string is rejected as a match, reject all strings with the same prefix
    • this requires an update of the Sift algorithm to return the offset at which it stopped (to which the maxOffset must be added)
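
  For the first idea, here is a minimal sketch of the bucketing, reusing the search, maxOffset and sift4 pieces from the sample above (the buckets themselves would be built only once, up front):

// group the word list into buckets by length (done once)
const buckets = new Map();
for (const word of english) {
    if (!buckets.has(word.length)) buckets.set(word.length, []);
    buckets.get(word.length).push(word);
}
// search: reject entire buckets using the minimum possible distance
let matches = [];
let bestDistance = 1000000;
for (const [len, words] of buckets) {
    if (Math.abs(search.length - len) > bestDistance) continue; // skip the whole bucket at once
    for (const word of words) {
        const dist = sift4(search, word, maxOffset, bestDistance);
        if (dist < bestDistance) {
            matches = [word];
            bestDistance = dist;
        } else if (dist === bestDistance) {
            matches.push(word);
        }
    }
}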

  I am still thinking of performance improvements. The transposition table gives more control over the precision of the search, but it's rather inefficient and resource consuming, not to mention adding code complexity, making the algorithm harder to read. If I can't find a way to simplify and improve the speed of using transpositions I might give up entirely on the concept. Also, some sort of data structure could be created - regardless of how much time and space is required, assuming that the list of strings to search is large and constant and the number of searches will be very big.

  Let me know what you think in the comments!


  Have you ever heard the saying "imitation is the sincerest form of flattery"? It implies that one copying another values something in the other person. But often enough people just imitate what they want, they pick and choose, they imitate poorly or bastardize that which they imitate. You may imitate the strategy a hated opponent uses against you or make a TV series after books that you have never actually read. I am here to argue that satire cannot be misused like that.

  Remember when Star Trek: Lower Decks first appeared? The high speed delivery, the meme driven humor, the self deprecating jokes, characters typical of coastal cities of the United States and, worst of all, something that made fun of Star Trek? After having idiots like J. J. Abrams completely muddle the spirit of Trek, now come these coffee drinking, groomed beard, bun haired hipsters to destroy what little holiness is left! People were furious! In fact, I remember some being rather adamant that The Orville is an unacceptable heresy on Star Trek.

  Yet something happened. Not immediately - it took a few episodes, sometimes a season, for the obvious jokes to be made, the frustrations to be exhausted, for the characters to grow. And then there it was: true Star Trek, with funny characters following the spirit of the original concept. No explosions, no angry greedy violent people imposing their culture over the entire universe, but rather explorers of the unknown, open to change and new experiences, navigating their own flaws as humans in a universe larger than comprehension. And also honest and funny!

  It was the unavoidable effect of examining something thoroughly for a longer period of time. One has to understand what they satirize in order to make it good. Not just skim the base ideas, not just read the summaries of others. Satire must go deep into the core of things: the causes, not just the effects, the expressions, the patterns, the in-jokes. Even when you are trying to mock something you hate, like a different ideology or a political or religious belief, you can only do it for a really short time before becoming really bad at what you are doing - a sad caricature for people just as clueless as you, trying to disguise anger by faking amusement. If you do it well and long enough, every satire makes you understand the other side.

  Understanding something does not imply accepting it, but either accepting or truly fighting something requires understanding. You want a tool to fight divisiveness, this artificial polarization grouping people into deaf crowds shouting at each other? That's satire! The thing that would appeal to all sides, for very different reasons, yet providing them with a common field on which to achieve communication. If jokes can diffuse tension in an argument between two people, satire can do that between opposing groups.

  And it works best with fiction, where you have to create characters and then keep them alive as they develop in the environment you have created, but not necessarily. I've watched comedians making political fun of "the other side" for seasons on end. They lost me every time when they stopped paying attention and turned joke to insult, examination to judgement. But before that, they were like a magnifying glass, both revealing and concentrating heat. At times, it was comedians who brought into rational discussion the most serious of news, while the news media was wallowing in political messaging and blind allegiance to one side or the other. When there is no real journalism to be found, when truth is hidden, polluted or discouraged, it is in jokes that we continue the conversation.

  So keep it coming, the satire, the mocking, the ridicule. I want to see books like Harry Potter and the Methods of Rationality, shows like Big Mouth and The Orville and ST: Lower Decks, movies like Don't Look Up! Give me low budget parodies of Lovecraft and Tolkien and James Bond and Ghost Busters and Star Wars and I guarantee you that by the second season they will be either completely ignored by the audience and cancelled or better than the "official" shows, for humor requires a sharp wit and a clear view of what you're making fun of.

  Open your eyes and, if you don't like what you see, make fun of it! Replace shouting with laughter, outrage with humor, indifference with amusement. 

  Today I had a very interesting discussion with a colleague who optimized my work in Microsoft's SQL Server by replacing a table variable with a temporary table. Which is annoying, since I've done the opposite plenty of times, thinking that I am choosing the best solution. After all, temporary tables have the overhead of being stored in tempdb, on the disk. What could possibly be wrong with using a table variable? I believe this table explains it all:

First of all, the storage is the same. How? Well, table variables start off in memory, but if they go above a limit they get saved to tempdb! Another interesting bit is the indexes. While you can create primary keys on table variables, you can't use other indexes - that's OK, though, because you would hardly need very complex table variables. But then there is the parallelism: none for table variables! As you will see, that's rather important. At least table variables don't cause recompilations. And last, but certainly not least, perhaps the most important difference: statistics! You don't have statistics on table variables.

Let's consider my scenario: I was executing a stored procedure and storing the selected values in a table variable. This SP's single purpose was to filter the ids of records that I would then have to extract - joining them with a lot of other tables - and it could return 200, 800 or several hundred thousand rows.

With a table variable this means:

  1. when inserting potentially hundreds of thousands of rows I would have no parallelism (slow!) and it would probably save it to tempdb anyway (slow!)
  2. when joining other tables with it, not having statistics, it would just treat it like a short list of values, which it potentially wasn't, and loop through it: Table Spool (slow!)
  3. various profiling tools would show the same or even fewer physical reads and the same SQL Server execution time, but the CPU time would be larger than the execution time (hidden slow!)

This situation has been improved considerably in SQL Server 2019, to the point that in most cases table variables and temporary tables show the same performance, but versions previous to that would show this to a larger degree.

And then there are hacks. For my example, there is a reason why parallelism DOES occur:

So are temporary tables always better? No. There are several advantages of table variables:

  1. they get cleared automatically at the end of their scope
  2. result in fewer recompilations of stored procedures
  3. less locking and resources, since they don't have transaction logs

For many simple situations, like when you want to generate a small quantity of data and then work with it, table variables are best. However, as soon as the data size or the scenario complexity increases, temporary tables become better.
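
To make the comparison concrete, here is a minimal sketch of the two options for my scenario (all table and column names here are made up):

-- option 1: table variable - no statistics and no parallelism while populating it
DECLARE @Ids TABLE (Id INT PRIMARY KEY);
INSERT INTO @Ids (Id)
SELECT Id FROM dbo.SomeBigTable WHERE SomeFilter = 1;

SELECT t.*
FROM dbo.OtherTable t
INNER JOIN @Ids i ON i.Id = t.Id;

-- option 2: temporary table - stored in tempdb, but it gets statistics and parallelism
CREATE TABLE #Ids (Id INT PRIMARY KEY);
INSERT INTO #Ids (Id)
SELECT Id FROM dbo.SomeBigTable WHERE SomeFilter = 1;

SELECT t.*
FROM dbo.OtherTable t
INNER JOIN #Ids i ON i.Id = t.Id;

DROP TABLE #Ids; -- the table variable needs no cleanup, it disappears at the end of its scope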

As always, don't believe me, test! In SQL everything "depends", you can't rely on fixed rules like "X is always better" so profile your particular scenarios and see which solution is better.

Hope it helps!


  Symptoms:

  • You press the Start key (or open the Start menu) and you type something expecting to filter items
  • Instead nothing happens
  • You click on the Windows 11 search textbox in the Start menu, but it takes no input
  • You get notifications in the tray bar, but if you click on them, like on Windows Updates, it doesn't open Windows Updates

  I searched and searched. No, the fix wasn't running ctfmon. No, it wasn't restarting anything. No, I didn't want to reinstall Windows all over again.

  What fixed it was removing/renaming the Computer\HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\IrisService registry key and restarting the computer. I thank Tech Advisor for this, although the problem wasn't that there were no answers on the net, but that there were too many and the vast majority of them were wrong and inept.
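
  For reference, something like this should do it from an administrator command prompt (I am assuming IrisService is a key at the path above; export it first so you can restore it later):

REM back up the key so it can be restored later
reg export "HKCU\Software\Microsoft\Windows\CurrentVersion\IrisService" "%USERPROFILE%\IrisService-backup.reg"
REM remove the key, then restart the computer
reg delete "HKCU\Software\Microsoft\Windows\CurrentVersion\IrisService" /f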

  Searching further, with the solution in hand, I found this little link: Windows 11 bug, so it is indeed an issue coming from Microsoft itself (probably update KB5006050). That should probably teach me not to install optional updates on my computer.

  The funny thing is that after I did this, I restored the original values of the registry entry and it didn't have any effect. What is this IrisService? No one seems to know for sure.

  Well, hope it helps!


  Whenever I want to share the analysis of a particular opening I have difficulties. First of all, I am not that good at chess, so I have to depend on databases and computer engines. The people in those databases either play randomly or don't play the things that I would like to see played - because they are... uncommon. Computer engines, on the other hand, show the least risky path to everything, which is as boring as accountants deciding how a movie should get made.

  My system so far has been to follow the main lines from databases, analyze what the computer says, show some lines that I liked and, in the end, get a big mess of possible lines, which is not particularly easy to follow. Not to mention that most computer lines, while beautiful in an abstract kind of way, would never be played by human beings. Therefore, I want to try something else for this one. I will present the main lines, I will present how I recommend you play against what people have played, but then I will try to condense the various lines into some basic concepts that can be applied at specific times. Stuff like thematic moves and general plans. It's not easy and I may not be able to pull it off, but be gentle with me.

  Today's analysis is about a line coming from the Cunningham Defense played against the King's Gambit, but it's not the main line. It's not better, but not really worse either. I think it will be surprising to opponents.

  We start with King's Gambit accepted: e4 e5 f4 exf4 Nf3, which is the best way to play against the gambit, according to machines and most professional players. Now, Black plays Be7 - the Cunningham defense, with the sneaky plan of Bh4+, Nxf4, Qh4+, forcing the king to move or try to futilely block with g3 and lose badly (or enter the crazy Bertin Gambit, which is very cool, too). Amazingly, there is little White can do to prevent this plan! White entered the King's Gambit with the idea to get rid of the pawn on e5 and play d4, controlling the center. The main line says that, after Be7 by Black, White's light square bishop has to move, Bc4 for example, making way for the king to move to f1.

  What I want to do, though, is not to move the bishop, but instead continue with the d4 plan. I am going to show you that moving the king is not that bad and that, succeeding in their plan, Black may overextend and make mistakes. There are some traps, as well. The play, though, has to be precise.

  And I dare you to find anybody talking about this particular variation! They all assume Bc4, h4, Nc3 or Be2 are the only options.

  Let's start with the main lines, in other words, what people play on LiChess from that position:

  As you can see, the most played moves are bad, starting with the very first one: Ke2. Instead, what I propose is Kd2, with the idea of c3 followed by Kc2. The king is relatively safe and there are some natural-looking moves that Black can end up regretting. Next stop, playing correct moves vs the most commonly played moves on LiChess:

  While it doesn't give an obvious advantage - what I propose is not a gambit, just a rare sideline with equal chances - it gives Black plenty of opportunity to make a mistake. Let's see where Black's play on LiChess agrees with Stockfish strategy and note that Black never gets a significant advantage, at the human level closest to perfect play:

  Finally, before getting to the condensed part, let's see how Black can mess things up:

  Note that there are not a lot of games in this variation, so the ideas you have seen above go only as far as anyone has played in that direction. Computer moves are wildly different from what most people play; that is why machines can be good at determining the best move, but they can hardly predict what humans would do.

  In conclusion, without blundering, Black keeps an extra pawn but less than 1 pawn evaluation advantage, meaning White always keeps the initiative, despite having the king in the open.

  White's plan is to keep control of the center with e4, d4, followed by c3 and Kc2, Bd3 or Bb5+, Bxf4, Nd2, etc. White's greatest problem is the rooks, which need a lot of time to develop. The attack can proceed on the queen's side with Qb5 or on the king's side by pushing the pawns, opening a file for the rook on h1. It's common to keep the dark square bishop under tension on h4, blocking Black's development, but then take it and attack on the weakened dark squares or on the queen's side, once Black's queen is on the other side of the board.

  I would expect players that are confident in their endgames to do well playing this system, as most pieces are getting exchanged and opponents would not expect any of these moves.

  Here are some of the thematic principles of this way of playing:

  • After the initial Bh4+, move the king to d2, not e2, preparing c3 and Kc2
    • while blocking the dark squares bishop, it uses the overextended Black dark square bishop and the pawn on f4 as a shield for the king
  • if d5, take with exd5 and do not push e6
    • opens the e-file for the queen, forcing either a queen exchange or a piece blocking the protection of the h4 bishop from the queen 
    • a sneaky intermediate Qa4+ move can break the pin on the queen and win the bishop on h4
  • if d6, move the light square bishop
    • Bd3 if the knight on b8 has not moved
      • allows the rook to come into the game and protects e4 from a knight attack (note that Nf6 breaks the queen defense of h4, but Nxe4 comes with check, regaining the piece and winning a pawn)
    • Bb5+ if the knight has moved
      • forces the Black king to move. If Bd7, then Bxd7 Kxd7 (the queen is pinned to the bishop on h4)
    • note that if Nc6 now, d5 is not a good move, as it permits Ne4 from Black
    • Qb5+ can win unprotected pieces on g5 or h5
  • once both Black bishops are active on the king's side, c3, preparing Kc2 and opening the dark square bishop to take the pawn on f4
    • play c3 on d6 followed by Nc6 as well
  • Nc6 is a mistake if there is no pawn at d6. Harass the knight immediately with d5:
    • Na5 is a blunder, b5 traps the knight
    • Nb5 gives White c3 with a tempo, preparing Kc2
    • Nb1 is actually best, but it gives White precious tempi (to play c3)
  • White's c4 instead of c3 is an option, to use only after Bg5, preparing Nc3
    • c4 might seem more aggressive, but it blocks the diagonal for the light square bishop
    • c4 may be used to protect d5 after a previous exd5
    • c4 may be used after a bishop retreat Be7 followed by Nf6 to prevent d5
  • White's Kc2 prepares Bxf4, which equalizes in most cases by regaining the gambit pawn
  • Black's Nf6 blocks the queen from defending the bishop on h4, but only temporarily. Careful with what the knight can do with a discovered attack if the bishop is taken.
    • usually a move like Bd3 protects e4. After Nxh4 Nxe4 Bxe4 Qxh4 material is equal, but Black has only the queen developed and White can take advantage of the open e-file
    • Kc2 also moves the king from d2, where the knight might check it
  • Keeping the king in the center is not problematic after a queen exchange, so welcome one
    • Qe1 (with or without check) after Nxh4, Qxh4 is a thematic move for achieving this
  • One of Black's best options is moving the bishop back Be7, followed by Nf6. However, regrouping is psychologically difficult
    • in this situation c4 is often better than c3
  • After Bg4, Qb5+ (or Qa4+ followed by Qb5) unpins the queen with tempo, also attacking b7 and g5/h5

  I hope this will come in handy, as a dubious theoretical line :) Let me know if you tried it.

Update:

  After analyzing with a bunch of engines, I got some extra ideas.

  For the d5 line, pay some attention to the idea of taking the bishop with the knight before taking on d5. So Kd2 d5 Nxh4 Qxh4 exd5. I am sure engines find an edge doing it like that, but I feel that at the level of my play and that of people reading this blog (no offence! :) ) it's better to keep the tension and leave opportunities for Black to mess up.

  Another interesting idea, coming from the above one, is that the Black queen needn't take the knight! Taking the pawn (dxe4) and developing the knight (Nf6) are evaluated almost the same as Qxh4! This is an idea for Black, I won't explore it, but damn, it seems like no one loves that horsey on h4!

  Then there are the ways to protect the doubled pawn on d5, either with c4 or Nc3. Nc3 is not blocking the light square bishop and allows for some possible traps with Bb5+, yet it blocks the c-pawn, disallowing Kc2. c4 feels aggressive, it allows both a later Nc3 and Kc2, but leaves d4 weak. Deeper analysis suggests c4 is superior, but probably only in the d5 lines.

  When Black's queen is on h4, White needs to get rid of it. Since the king has moved, an exchange is favorable, but it also removes the defender of the pawn on f4.

  In some rare lines, Black plays Nh5 to protect f4. The strange but perfectly valid reply is to move the knight back, allowing the queen to see the Black knight: Ne1 or even Ng1.

  In positions where the king has reached c2 and there is a Black knight on c6, prevent Nb4 with an early a3.

  In the d6 line, Black has the option to play c5, attacking the pawn on d4. Analysis shows that dxc5 is preferred, even when the semi-open file towards the lined up king and queen opens and the pawn takes towards the edge. Honestly, it's hard to explain. Is the c-pawn so essential?

  If you've read this far, I think that the best way for Black to play this, the refutation of this system so to speak, is moving the bishop back with Be7, followed by Nf6. And the interesting thing is that the reply I recommended, c4 preventing d5, may not be superior to Nb3 followed by exd5.


  There are two different measures of the value of something that sound a lot like each other: efficiency, which represents the value created with respect to the effort made, and efficacy, or effectiveness, which at first glance seems to be only about the value created. You are efficient when you build a house with less material or finish a task in less time. You are effective when you manage to finish the task or build the house, when you get the job done. Yet no one will tell someone "Oh, you've built a highway in 30 years, that's efficacy!" (Hello, Romania!). Efficacy is when you consistently get the job done.

  Imagine you are a chess player. You are efficient when you can beat people by moving faster than them, by thinking more in the same amount of time. This allows you to play faster and faster time controls and still win. However, think of the opposite situation. You start by being good at bullet chess and then the time controls get slower and slower. You are effective when you keep winning no matter how much time you have at your disposal. Efficacy is also when you keep winning games.

  That was my epiphany. Take me, for example. I don't play better chess when I get more time to think. I am not fully using the resources available to me. I can give a lot of examples. I have money, more or less, so do I use it to the best of its value? Hell, no! I suck at both chess and finance. The point is that some people would do well with an average amount of resources, but then they would not do better with more of them. These are two faces of the same coin. One is the short distance lightning fast runner and the other is the marathon runner. Both of them are good at running, but in different resource environments.

  Both efficacy and efficiency are relative values, value over resources, a measure of good use of resources: use few resources well, you are efficient, use a lot of resources well, you're effective. It's the difference between optimization and scalability.

  Why does it matter? I don't know. It just seemed meaningful to explore this realization I've had, and of course to share it.

  Take a good writer who wrote a masterpiece in between working and living. He achieved a lot with less. But what if you gave him money so he didn't have to work? Would he write more books, or better books? In our day and age, scalability has become more important than efficiency. If you provided value for 10 people, can you provide it to 100? That is more important than getting it to be 10 times better.

  Can one apply scale economics to their own person? If I thought 10 times faster than everybody, would I have 10 times more thoughts or would I just learn not to think that fast, now that I have the time? You see, it seems that, applied to a person, the two concepts are similar, but they are not. Thinking 10 times more thoughts in the same amount of time or taking 10 times less time to think the same thoughts might seem the same, but it's like comparing listening to two people at the same time with listening deeply to a single person. Internally it seems the same, but the external effect is felt differently.

  I don't have a sudden and transformational insight to offer, but my gut feeling is that trying to scale one's efforts - or at least seeing the difference to optimizing them - is important.


  I am not really a King's gambit man, but a friend of mine loves to play it so I've started looking into it and stumbled upon this very interesting variation which I found very instructive. Basically White is gambitting not only a pawn, but also a piece, all for development and immediate attacking chances. Now, if you thought the King's gambit is a romantic era chess opening that has been refuted (no, it has not been, but that's all people remember from the title of an article written by Robert Fischer 50 years ago) then you will probably think this continuation is lunacy.

  Luckily, LiChess makes chess study so simple that even I may sound smart talking about it, so here it is. The position begins from the King's gambit accepted - which is the best line for Black according to computer engines - continues with the Rosentreter gambit, allowing Black to chase the important f3 knight away, then White completely abandons the important knight - the so called Testa variation! And then White sacrifices another piece! It's so much fun.

1. e4 e5 2. f4 exf4 3. Nf3 g5 4. d4 { Rosentreter gambit } 4... g4 5. Bxf4 { Testa variation } 5... gxf3 6. Qxf3 { At this point, evaluation is just -1.2, even after sacrificing a piece for a pawn! } 6... Nc6 7. Bc4 Nxd4?? { Black has fallen into the trap. Note that other than the beginning gambit moves and this blunder, all Black moves are Stockfish best moves. } 8. Bxf7+ { Boom! Stockfish evaluation jumps to +5, but getting there takes some time. } 8... Kxf7 9. Qh5+ { No greedy discovered check here as the Queen is under attack. }

  Note that there is another opening similar to the gambit I am discussing, also arising from the Rosentreter variation of the King's gambit, where instead of coming up with Bc4, White plays Nc3 - the so called Sørensen Gambit. Similar or same positions may be reached from From's gambit, the Vienna gambit, the Steinitz gambit, the Polerio gambit or the Pierce gambit.

  Lieutenant Colonel Adolf Rosentreter was a German chess player who lived between 1844 and 1920. He seems to have been a gambit loving chess player, as there are at least two gambits named after him, the most famous being the Giuoco Piano one, which he used to completely destroy one Heinrich Hoefer in 1899, in just 13 moves. Funny enough, in the LiChess player database there are 10 games that end in an identical fashion. I don't know who Testa was, though they are possibly the one who should be lauded for this version of the opening.

  Anyway, the post is about the gambit in the King's gambit. From my analysis, White can have a lot of fun with this opening. Also, none of the main lines (played the most in LiChess for various ratings) are actually any good, in the sense that the few people who employ the gambit don't know how to implement it best and their opponents usually blunder almost immediately. "Masters" don't use it or at least don't fall into the trap, so keep in mind this is something to use in your own games. So Magnus, don't read this then play it in the world championship or something!

  Also, even in the case of Black not falling into the trap, the opening still leaves White fully developed and Black with the burden of demonstrating an advantage. As you can see from the image, the only White piece undeveloped is a knight. Once that one develops and the king castles, all pieces are out. Meanwhile, Black has moved only a knight, one that - spoiler alert - will be lost anyway.

  Note that I've used Stockfish to play the best moves (other than entering the gambit and accepting the poisoned pawn). For the main lines chapter I went through the games in the LiChess database. Agreed, there are only a few hundred in total, but that proves how much of a weapon this can become. Hope you enjoy it and, as always, I welcome comments.

  Without further ado, here is my study for the Rosentreter gambit Testa variation in the King's gambit accepted:

 


  Monstress is a feudal fantasy manga set in a world of ancients, old gods, humans and half humans. After a period of peaceful coexistence, the humans and the half humans separated in two different countries, which now head towards war. The main character is a girl with mysterious powers, powerful but angry, who is the center of a storm that will either save or destroy the world.

  The color drawing is very beautiful, as are all the gods and creatures, very detailed and purposeful. Faces and bodies are expressive. I like the story a lot; clearly thought and love have been poured into it by both writer Marjorie Liu and illustrator Sana Takeda. It reminds me of Berserk, back when it was still good - the first 26 chapters - both in the drawing style, with detailed filigree indicating godly power, and in the story, which is at times cruel and unforgiving, meant to forge the hero into something spectacular.

  At this moment the manga is still going strong, with 41 chapters published. You can read it online here. I highly recommend it. It has earned many awards, including five Eisner Awards, four Hugo Awards, and the Harvey Awards Book of the Year in 2018.


  I caught myself thinking again about the algorithms behind chess thinking, both human and computer. Why is it so hard for people to play chess well? Why is it so easy for computers to come up with winning solutions? Why are they playing so weird? What is the real difference between computer and human thinking? And I know the knee-jerk reaction is to say "well, computers are fast and calculate all possibilities, humans do not". But it's not that simple.

  You see, for a while, the only chess engine logic was min-max. A computer would have a function determining the value of the current board position, then using that function, determine what the best move would be by exploring the tree of possibilities, alternating between what a player would most likely do and what the reply would most likely be. Always trying to maximize their value and minimize the opponent's. The value function was determined from principles derived by human masters of the game, though, stuff like develop first, castle, control the center, relative piece value, etc. The move space also increases exponentially, so no matter how fast the computer is, it cannot go through all possibilities. This is where pruning comes in, a method of abandoning low value tree branches early. But how would the computer determine if a branch should be abandoned? Based on a mechanism that also takes into account the value function.
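
  To make that concrete, here is a minimal min-max sketch with alpha-beta pruning, in JavaScript. The evaluate, getMoves and applyMove functions are placeholders for the board value function, the move generator and the board update - they are not real engine code:

// value of a position, looking `depth` moves ahead; alpha and beta implement the pruning
function minimax(position, depth, alpha, beta, maximizing) {
    if (depth === 0) return evaluate(position); // the hand-crafted value function
    let best = maximizing ? -Infinity : Infinity;
    for (const move of getMoves(position)) {
        const value = minimax(applyMove(position, move), depth - 1, alpha, beta, !maximizing);
        if (maximizing) {
            best = Math.max(best, value);
            alpha = Math.max(alpha, best);
        } else {
            best = Math.min(best, value);
            beta = Math.min(beta, best);
        }
        if (alpha >= beta) break; // prune: this branch cannot change the final choice
    }
    return best;
}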

  Now humans, they are slow. They need to compress the move space to something manageable, so they only consider a bunch of moves. The "pruning" is most important for a human, but most of it happens unconsciously, after a long time playing the game and developing a heuristic method of dismissing options early. This is why computer engines do not play like humans at all. Having less pruning and more exploring, they come up with solutions that imply advantage gains after 20+ moves, they don't fall into traps, because they can see almost every move ahead for three, four or more moves, and they are frustrating because they limit the options of the human player to the slow, boring, grinding pathways.

  But now a new option is available: chess engines like Alpha Zero and Leela, which use advances in neural network technology to play chess without any input from humans. They play millions of games against themselves until they understand what the best move is in a position. Unsurprisingly, as neural networks are what we have in our brains, these engines play "more human", but they also came up with play strategies that amazed chess masters everywhere. Unencumbered by an education that fixed piece values or limited by rigid principles like controlling the center, they revolutionized the way chess is being played. They also gave us a glimpse into the working of the human brain when playing chess.

  In conclusion, min-max chess engines are computer abstractions of rigid chess master thinking, while neural network chess engines are abstractions of creative human thinking. Yet the issue of compressing the move space remains a problem for all systems. In fact, what the neural network engines did was just reengineer the value functions for board evaluation and pruning. Once you take those and plug them into a min-max engine, it wins! That's why Stockfish is still the best engine right now, beaten by Alpha Zero only in very short move time play modes. The best of both worlds: creative thinking (exploration) leading to the best method of evaluating a chess position (exploitation).

  I've reached the moment when I can make the point that made me write this post. Because we have these two computer methods of analyzing the game of chess, now we can compare the two, see what they mean.

  A min-max will say "the value of a move is what I will win after playing the best possible moves of them all (excluding what I consider stupid) and my opponent will play their best possible moves". It leads to something apparently very objective, but it is not! It is the value of a specific path in the future, one that is strongly tied to the technology of the engine and the resources of the computer running it. In fact, that value has no meaning when the opponent is not a computer engine or it is a different one! It is the abstraction of rigid principles.

  A neural network will say "based on the millions of games that I played, the value of a move is what my statistical engine tells me, given the current position". This is, to me, a lot more objective. It is a statistical measure, constructed from games played by itself with itself, at various levels of competence. Instead of a specific path, it encompasses an average, a prescient but blurred future where many paths are collapsed into a value. It is the abstraction of keeping one's mind open to surprises, thus a much more objective view, yet less accurate.

  Of course, a modern computer chess engine combines the two methods, as it should. There is no way for a computer to learn while playing a game; training takes a lot of time and resources. There are also ways of directing the training, something that I find very exciting, but given the computational needs required, I guess I will not see it very often. Imagine a computer trained to play spectacular games, not just win! Or training specific algorithms on existing players - a thing that has become a reality, although not mainstream.

  The reason why I wrote this post is that I think there are still many opportunities to refine the view of a specific move. It would be interesting to see not a numerical value for a move, but a chart, indicating the various techniques used to determine value: winning chances, adherence to a specific plan, entertainment value, gambit chances, positional vs tactical, how the value changes based on various opponent strengths, etc. I would like to see that very much, to be able to choose not only the strength of a move from the candidate moves, but also the style.

  But what about humans? Humans have to do the same thing as chess engines: adapt! They have to extract principles from the new playing style of neural network engines and update the static, simplistic, rigid ones they learned in school. This is not simple. In fact, I would say this is the most complex issue in machine learning today: once the machine determined a course of action, how does one understand - in human terms - what the logic for it was.

  I can't wait to see what the future brings.


  I don't want to be mean, especially since I've only read 42 chapters of the 1000+ that make up One Piece, but I found it boring. I did read Inuyasha, Naruto, One Punch Man and even Bleach religiously, but I was younger and had a lot of time on my hands. This one is just another "young male teen with no actual connections meeting friends and enemies and leveling up" story. And it's also very childish.

  I enjoy shōnen manga, but this is just too ridiculous for me. I understand it gets better later on, but I've skipped somewhere in the middle and it didn't seem to. Anyway, I guess I can categorize this as DNF and move on with my life.

  If you like it, you can read it free online here: One Piece

  I had this situation where I was trying to optimize a query. And after some investigation I've stumbled upon something strange: querying on the primary key was generating a lot of reads. I was joining my table with a temporary table of 10 ids and there were 630 reads! How come?

  At first I thought it was because of the way indexes work. The primary key was comprised of RowId and RowDate and, even if I knew that, theoretically, searching by RowId should use the primary key, the evidence was against me: when querying by RowId and RowDate I would get the expected 10 reads.

  I created two queries, one with and one without RowDate. I then compared their execution plans. They were identical! Only one took a lot longer, specifically in the Index Seek (which correctly used the primary key). When I looked at the properties for that plan element, I saw something strange:

Actual Partitions Accessed 1..63!

I then realized that the table was partitioned on the RowDate column. In this case, RowDate takes precedence over any indexed column! You might think of partitioning a table like forcefully adding the partition columns to every index in the table, including the primary key. In fact, a partitioned table acts like a number of separate tables with the same definition (columns, indexes, etc.), just different data. The indexes work on each separate partition. When you partition a table, you also partition its indexes.

In truth, I would have expected the query execution plan to show the partition split as a separate step. I understand it's hard to conceptualize it without creating as many execution paths as there are partitions, but still, there should be an indication in the shape of the plan that makes it clear you are querying on multiple partitions.

Once RowDate was used, the SQL engine would choose the one partition containing my row, then use the primary key index to seek it. Instead of 63*10 reads, just 10 reads, the number of rows in the id table.
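
In code, the difference looks something like this (the table names are made up, RowId and RowDate are the columns from my scenario):

-- seeks in every partition, because the partitioning column (RowDate) is not in the join
SELECT t.*
FROM dbo.PartitionedTable t
INNER JOIN #Ids i ON i.RowId = t.RowId;

-- partition elimination first, then a single primary key seek inside that one partition
SELECT t.*
FROM dbo.PartitionedTable t
INNER JOIN #Ids i ON i.RowId = t.RowId AND i.RowDate = t.RowDate;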

So be careful when you use table partitioning to ALWAYS use the partition columns in the queries for the table, else you will get as many parallel searches as there are partitions, regardless of the indexes you created, as they are also partitioned.

Hope that helps!


  Junji Ito is a manga artist that creates horror, usually focused on personal obsession, body horror and disgust. He is like the Japanese Serge Brussolo, in a way. Uzumaki (translated as Spiral) is a 20 chapter story about a small town infested by spirals, which have more and more horrifying effects as the story unfolds (heh!). Yet perhaps the most horrible thing that transpires from the manga is the typical Japanese social and cultural pressure that keeps people in their place, in their role, denying that anything could be wrong.

  I mean, in the beginning, people were leaving the town and coming back, noticing as they did how strange it was compared with the place they were coming from. Later on, the main characters see horrible things happening and still won't leave, while anyone who heard them explain what had happened - even if obvious to anyone looking - refused to accept that it was anything but rampant imagination. And, of course, they stop telling people things, because of the ultimate horror: being stigmatized in their society. That is true horror to me, that people would choose to live their lives like that. The way the town folk end up at the end seems to me like a metaphorical criticism of Japanese culture, but I may be wrong. 

  Anyway, the drawings are good, imaginative, and the manga succeeds in instilling that pervasive feeling of dread. The story, gradually getting weirder and weirder but in small increments, also manages to hold the reader on the edge of disbelief. It's short, too, so no need to invest a lifetime in reading it.

  If you want to read it, Uzumaki is freely available online, on the Junji Ito site, but also on a dedicated site that looks very similar, only with an extra bonus chapter, so I would go there. If you are a fan of horror, maybe Brussolo, maybe Lovecraft or even John Saul, I think you will enjoy this a lot.


  Fungi is a short story collection, fantasy and sci-fi, mostly leaning towards horror, edited by Orrin Grey and Silvia Moreno-Garcia. I am fascinated by fungi and am also a horror fan, so I expected to love the book. Well... it was OK. I enjoyed most of the stories but, to be fair, the fungal influence on most plots was either marginal, like some evil affliction evidenced by mycelium growth, or too obvious, like the pulsating, life eating and/or controlling mushroom mass.

  It is possible that I bore a grudge from the very moment I started reading the book, having expected it to be a novel, only to discover tales too short to get anywhere. It was great to listen to a short story while walking the dog and not have to get too invested, but other than that I was not that captivated. The stories were decent, most of them, but perhaps I was not really in the mood for a collection.

  So, the bottom line is that I had my expectations set way too high and thus was inevitably disappointed. I didn't learn anything more about fungi, because most of the plots were about infestations that required no understanding of the processes involved.