  Bad Blood: Secrets and Lies in a Silicon Valley Startup is a book worthy of a political thriller miniseries, only too real. It recounts the 10+ year history of Theranos, a "unicorn company" built on nothing but personal charisma and lies, which reached a valuation of ten billion American dollars at its peak. Billions, with a B! It also shows how that can happen within the American economic, political and social system, which - if you ask me - is much more damning and interesting than the exposure of Elizabeth Holmes and her cronies.

  It also took the faith, training and backing of a powerful journalistic institution for John Carreyrou to be able to bring this to public attention, something worth considering in this climate of journalistic consolidation into partisan corporations that care nothing for the truth. It would have been so easy for this to have continued for years, unchecked and uncheckable, if it weren't for this tiny detail.

  To boot, this book will be extremely triggering to anyone working in a corporate environment, especially Americans. Let's play some corporate bingo: a sociopath CEO claiming their vision takes precedence over anything and anyone, older-generation Indian management that feels the lives of employees belong to the company, a paranoid NDA-backed culture where people just disappear without any mention to the remaining employees, totalitarian control of data, communication and the general narrative, backed by law firms paid millions of dollars to intimidate anyone who might challenge it, inner-circle privileges given to loyal individuals, university-dropout visionaries who consider any technical hurdle something to be solved by lowly multiple-PhD-holding peons and not something that can hold them back, even if they themselves are technical imbeciles, a yes-men culture where dissent or even mere criticism is considered treason, to be punished by immediate termination, public humiliation and legal action. The list can go on...

  I can't recommend this book enough. It's not entertaining in any meaningful way; instead it's terrifying. Imagine being in a situation where you have the knowledge, the certainty, the moral high ground, the awareness of being absolutely in the right, only to give it all away because someone with a lot of money sics a law firm on you. Imagine being bullied at every level once you have haphazardly signed some documents that you assumed were standard corporate operating practice, but with which you actually signed your soul away to the company. Imagine trying to tell people that something is terribly wrong, only to be met with dismissive comments on your character and expertise, just because someone believes in a PowerPoint presentation more than in any deity and because you are not part of the in-group.

  But one thing that the book does not discuss, although it implies it heavily throughout, is how something like this can happen. How is it possible that the law can be corrupted to stop people from reporting unlawful acts? How can a company be created, thrive and be financed by people on promises alone, while highly educated and well informed naysayers can be summarily dismissed at any moment and their input suppressed? In fact, this is a direct and informed criticism of the way American society works at the higher levels. Theranos was a symptom that, unchecked, led to Trumpism. There are direct parallels between the mindset of the management in this 2010s company and the political system taking over in the 2020s, with mindless loyal cronies being hired for all of the critical jobs on a wave of populist faith.

  Even spookier is the strong desire people felt for this book to be a hit-job, to have the young, charismatic female entrepreneur Elizabeth somehow be the victim of the male-dominated system, the disgruntled employees, the Svengali lover 20 years her senior and the irate Indian bully, the vengeful journalist, all working together to stop her from living out her fantasy of becoming the next Steve Jobs. You can imagine a Scooby Doo moment where she could have just made everything work out if it weren't for the pesky kids. But the truth documented in this book shows that, while certainly some sort of victim, Holmes was a mentally deranged individual who still managed to play the entire world and reach wealth and prestige that even some nations only dream of.

  Bottom line: you have to read this book, even if it's very long, terrifying, frustrating and its "happy ending" only demonstrates that you have to make a LOT of mistakes for justice to happen when you have enough money and political backing.

  Jesus Christ, this book has 100000 chapters! And that's because there are 10 types of people in the world: those who understand binary and those who don't.

  Anyway, Sea of Rust starts off as yet another western with robots, set in a post-apocalyptic desert, where machines with guns act and feel like people in a lawless land. But as the book progressed, it became more of an exploration of what it means to be alive, what meaning we derive from life and what it means to actually live as opposed to just survive. It wasn't a literary or philosophical masterpiece or anything, but it did carry a nice punch to the gut in some scenes. The book became better as it got close to the end, to the point that I could have considered reading the next book in the story, if there were one.

  In fact, there is another book in the series, but it's a prequel. Film writer types, right?

  There are about three major twists: one that you kind of guess from the beginning, one that I should have seen coming but never thought about, and another that is more or less a hope to get into another genre later on. There are also some major plot holes, but just a few, and they underpin the story, so you just have to ignore them if you want to enjoy the book.

  C. Robert Cargill writes films as well as books: he did the Sinister films, Doctor Strange and The Black Phone, the last two of which I've seen and enjoyed. His writing is good, although you can feel its cinematic nature. Writing from the perspective of robots did help with limiting things to just sound and visuals anyway.

  I liked the book, although I didn't feel it spent enough time creating the world or its characters. A world of machines should have been orders of magnitude more diverse and interesting than what's in this book. It's one of those stories that need to be told in a certain way, and the world around it is just a prop. It feels linear and without breadth. If I were to compare it with anything, it would be the Fallout TV series, but seen only from the perspective of the ghoul.

  The anime Dan Da Dan has a funny and endearing first episode where two awkward high school students dare each other to test each other's paranormal/alien beliefs. And what do you know? Both find stuff! But then it gets into the same terrible Japanese clichés of monster of the week and new nakamas and high school dramas.

   The opening music is fun, too, which probably hits hard when you first watch the series, but then, together with the content, it fades into the background.

   But the first episode is worth it.

  I tried reading some Philip K. Dick in my early years and didn't like it. But now I am older and wiser, ready to process the brilliant ideas in PKD's books. No longer will I feel that a paranoid stoner on a bad trip in the '60s is writing random stories about how reality is not real and consciousness creates new ones again and again and again, just to spite me personally. That's just the hubris and egocentrism of youth. Right? Right?!

  No. Ubik took me forever to finish because I didn't like it. The writing was good, but inconsistent, moving from philosophical to direct, just like a stoner would when writing about unravelling reality. The characters were there just to push the plot, however flimsy, forward, while the scathing satire of the capitalist system was just caricaturesque and lacking any depth.

  Worse, this is one of those story types that I personally despise. You will understand what I mean when you get to the end, if you get to the end, because I don't want to spoil the book. It was short and I still had to drag myself to finish it. I am sure it was brilliant in 1969, but 55 years later it's just quirky.

  To be fair, this is not supposed to be one of his best books, so maybe there are some that I will just love if I try hard enough. But I prefer other authors.

Here is the scenario:

  • you have two tables: a source of recent data and an SQL Server destination table with the same structure that has to be updated with that data
  • they are both on the same server (we assume that if the data source is external you have copied it into a local temporary table)
  • the destination table is in active use
  • the source table may be in active use as well
  • the tables are large, locking them and waiting for the merge to finish is not an option
  • the difference between the tables is usually small. Not so small as to be irrelevant, but small relative to the size of the tables
  • we don't care about full data consistency (any intermediate state between the current and the updated one is acceptable)

There are some quirks in SQL Server that affect our task, mainly that if you try to modify around 5000 rows or more of a table in a single statement, the locking will escalate from row locks to a table lock, making the table unusable for the duration of the merge. Also, for operations involving a lot of data, the transaction log of the database will grow to accommodate all the changes before they are persisted and the log space can be reused. This takes a lot of disk space, and if anything goes wrong during the operation, everything is rolled back from the log, which also takes a lot of time and blocks the database.

The first idea is to find rows that are different, new or missing and just sync 5000 of them at a time. One can do this with a MERGE or with classic INSERT, UPDATE, DELETE operations. And this works, but it has a major flaw: after the first 5000 rows have been found and synchronized, the next operation will scan through them again anyway to find the next 5000 rows, and again, and again, and again. Each batch gets slower than the last, so the total execution time grows roughly quadratically with the number of changed rows.

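To get an idea of why that is, here is a minimal sketch of the naive approach, assuming hypothetical table names and a single key column [ID]; only the delete branch is shown, the update and insert branches would look similar:

DECLARE @ROWS INT = 1
WHILE (@ROWS > 0)
BEGIN
  -- remove destination rows that no longer exist in the source, at most 4900 at a time
  DELETE TOP (4900) tgt
  FROM SomeSchema.DestinationTable tgt
  WHERE NOT EXISTS (SELECT * FROM SomeSchema.SourceTable src WHERE src.[ID] = tgt.[ID])

  SET @ROWS = @@ROWCOUNT
END
-- every iteration re-evaluates the whole difference just to find its next batch
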
What we actually need is to take a list of changes and apply them in batches, kind of like an artificial transaction log. The following stored procedure takes two parameters: the full schema name of the source table and the full schema name of the destination table. It creates a log of changes, takes 4900 of them at a time and applies them to the destination table. The only restrictions are that the destination table must have primary key columns and that the source and destination tables have the same columns. For performance reasons, it's best that the source table also has the same primary keys or at least an index on the same columns. The usage would look like EXEC usp_MergeTables 'SomeSchemaIncludingDbo.SourceTable','MaybeSomeOtherSchema.DestinationTable'

I will explain what it does after the code:

-- this is for batched merge (del, upd, ins) of the data of a table into another (nolock/readpast for reduced locking of source and destination tables)

CREATE OR ALTER PROC usp_MergeTables(@SourceTable NVARCHAR(256),@DestinationTable NVARCHAR(256))
AS
BEGIN

SET NOCOUNT ON

IF NOT EXISTS(SELECT * 
FROM sys.schemas s
INNER JOIN sys.tables t
ON s.schema_id=t.schema_id
WHERE REPLACE(REPLACE(@SourceTable,']',''),'[','')=s.name+'.'+t.name)
BEGIN

  DECLARE @Err1 VARCHAR(100) = 'Source table '+@SourceTable+' not found!'
  ;THROW 50404,@Err1,1

END

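-- collect metadata for the destination table columns: name, declared type, identity/computed flags and whether the column is part of the primary key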
SELECT CAST(s.name as NVARCHAR(Max)) as schemaName,CAST(t.name as NVARCHAR(Max)) as tableName,CAST(c.name as NVARCHAR(Max)) as columnName,c.is_computed,c.is_identity, 
CASE 
  WHEN tp.name IN ( 'varchar', 'char', 'varbinary' ) THEN tp.name + 
                  CASE WHEN c.max_length = -1 THEN '(max)'       
                         ELSE '(' + CAST(c.max_length AS VARCHAR(4)) + ')' END       
  --types that have a Unicode character type that requires the length to be halved 
  WHEN tp.name IN ( 'nvarchar', 'nchar' ) THEN tp.name + 
                   CASE WHEN c.max_length = -1 THEN '(max)'       
                         ELSE '(' + CAST(c.max_length / 2 AS VARCHAR(4)) + ')'       
                                                              END
   --types with a datetime precision 
  WHEN tp.name IN ( 'time', 'datetime2', 'datetimeoffset' ) THEN tp.name + 
                                            '(' + CAST(c.scale AS VARCHAR(4)) + ')' 
  --types with a precision/scale 
  WHEN tp.name IN ( 'numeric', 'decimal' ) 
  THEN tp.name + '(' + CAST(c.precision AS VARCHAR(4)) + ',' +
                                             CAST(c.scale AS VARCHAR(4)) + ')'
  --timestamp should be reported as rowversion 
  WHEN tp.name = 'timestamp' THEN 'rowversion' 
  --and the rest. Note, float is declared with a bit length, but is 
  --represented as either float or real in types  
  ELSE tp.name 
END as typeName,
ISNULL(( SELECT pk.is_primary_key FROM sys.indexes pk
INNER JOIN sys.index_columns ic
ON ic.object_id = pk.object_id
AND ic.index_id = pk.index_id
AND c.column_id=ic.column_id
WHERE t.object_id = pk.object_id
AND pk.is_primary_key = 1
),0) as is_primary_key
INTO #tgtColumns
FROM sys.schemas s
INNER JOIN sys.tables t
ON s.schema_id=t.schema_id
INNER JOIN sys.all_columns c
ON t.object_id=c.object_id
INNER JOIN sys.types tp
ON c.system_type_id=tp.system_type_id
AND c.user_type_id=tp.user_type_id
WHERE REPLACE(REPLACE(@DestinationTable,']',''),'[','')=s.name+'.'+t.name

IF NOT EXISTS(SELECT * FROM #tgtColumns)
BEGIN

  DECLARE @Err2 VARCHAR(100) = 'Destination table '+@DestinationTable+' not found!'
  ;THROW 50404,@Err2,2

END

IF NOT EXISTS(SELECT * FROM #tgtColumns WHERE is_primary_key=1)
BEGIN

  DECLARE @Err3 VARCHAR(100) = 'Destination table '+@DestinationTable+' has no primary keys!'
  ;THROW 50404,@Err3,3

END

DECLARE @operSql NVARCHAR(Max)
DECLARE @delSql NVARCHAR(Max)
DECLARE @updSql NVARCHAR(Max)
DECLARE @insSql NVARCHAR(Max)
DECLARE @identityInsertOn NVARCHAR(Max)=''
DECLARE @identityInsertOff NVARCHAR(Max)=''

IF EXISTS(SELECT * FROM #tgtColumns WHERE is_identity=1)
BEGIN

  SET @identityInsertOn=CONCAT(N'
SET IDENTITY_INSERT ',@DestinationTable,N' ON
')
  SET @identityInsertOff=CONCAT(N'
SET IDENTITY_INSERT ',@DestinationTable,N' OFF
')

END


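-- build the dynamic SQL that creates #oper: the primary key values of every row that differs, plus the operation needed (DEL, UPD or INS)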
SELECT @operSql = CONCAT(N'DROP TABLE IF EXISTS #oper

SELECT *
   INTO #oper
FROM (
SELECT ',STRING_AGG(CONCAT(N'ISNULL(src.[',c.columnName,N'],tgt.[',c.columnName,N']) as [',c.columnName,N']'),N', '),N',
  CASE
    WHEN ',STRING_AGG(CONCAT(N'src.[',c.columnName,N'] IS NULL'),N' AND '),N' THEN ''DEL''
    WHEN ',STRING_AGG(CONCAT(N'tgt.[',c.columnName,N'] IS NULL'),N' AND '),N' THEN ''INS''
    WHEN (
    	SELECT * FROM ',@SourceTable,N' R
    	WHERE ',STRING_AGG(CONCAT('R.[',c.columnName,N'] = src.[',c.columnName,N']'),N' AND '),N'
    	FOR XML PATH(''Row''), ELEMENTS XSINIL
      ) <> (
    	SELECT * FROM ',@DestinationTable,N' R
    	WHERE ',STRING_AGG(CONCAT('R.[',c.columnName,N'] = tgt.[',c.columnName,N']'),N' AND '),N'
    	FOR XML PATH(''Row''), ELEMENTS XSINIL
      ) THEN ''UPD''
   END as __oper__
   FROM ',@SourceTable,N' src (NOLOCK)
   FULL OUTER JOIN ',@DestinationTable,N' tgt (NOLOCK)
   ON ',STRING_AGG(CONCAT('src.[',c.columnName,N'] = tgt.[',c.columnName,N']'),N' AND '),'
) x
WHERE __oper__ IS NOT NULL

CREATE INDEX temp_id ON #oper(',STRING_AGG(CONCAT('[',c.columnName,N']'),N', '),N')
CREATE INDEX temp_oper ON #oper(__oper__)')
FROM #tgtColumns c
WHERE c.is_primary_key=1

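-- build the batched delete: remove up to 4900 'DEL' rows at a time from the destination, capture them via OUTPUT into @batch and clear them from #oper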
SELECT @delSql = CONCAT(N'

DECLARE @batch TABLE(',STRING_AGG(CONCAT('[',c.columnName,N'] ',c.typeName),N', '),N',
                   PRIMARY KEY(',STRING_AGG(CONCAT('[',c.columnName,N']'),N', '),N'))

DECLARE @ROWS INT = 1

WHILE (@ROWS>0)
BEGIN

  DELETE TOP (4900) tgt
    OUTPUT ',STRING_AGG(CONCAT('deleted.[',c.columnName,N']'),N', '),N'
	INTO @batch
  FROM ',@DestinationTable,N' tgt (READPAST)
  INNER JOIN #oper src
    ON ',STRING_AGG(CONCAT('src.[',c.columnName,N'] = tgt.[',c.columnName,N']'),N' AND '),N'
  WHERE src.__oper__=''DEL''

  SET @ROWS=@@ROWCOUNT

  DELETE o
  FROM #oper o
  INNER JOIN  @batch b
  ON ',STRING_AGG(CONCAT('o.[',c.columnName,N'] = b.[',c.columnName,N']'),N' AND '),N'
  
  DELETE FROM @batch

  IF (@ROWS=0)
    SELECT @ROWS=COUNT(*) FROM #oper WHERE __oper__=''DEL''

END')
FROM #tgtColumns c
WHERE c.is_primary_key=1


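-- build the batched update: copy up to 4900 'UPD' rows at a time from the source over the destination, then clear them from #oper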
SELECT @updSql = CONCAT(N'

SET @ROWS = 1

WHILE (@ROWS>0)
BEGIN

  UPDATE tgt
  SET ',(SELECT STRING_AGG(CONCAT('[',c.columnName,N'] = src.[',c.columnName,N']'),N', ')
FROM #tgtColumns c
WHERE c.is_primary_key=0 AND c.is_computed=0),N' OUTPUT ',STRING_AGG(CONCAT('inserted.[',c.columnName,N']'),N', '),N'
	INTO @batch
  FROM ',@DestinationTable,N' tgt (READPAST)
  INNER JOIN ( 
    SELECT TOP (4900) s.*
    FROM #oper o
	INNER JOIN ',@SourceTable,N' s
	ON ',STRING_AGG(CONCAT('s.[',c.columnName,N'] = o.[',c.columnName,N']'),N' AND '),N'
    WHERE __oper__=''UPD''
  ) src
    ON ',STRING_AGG(CONCAT('src.[',c.columnName,N'] = tgt.[',c.columnName,N']'),N' AND '),N'

  SET @ROWS=@@ROWCOUNT

  DELETE o
  FROM #oper o
  INNER JOIN  @batch b
  ON ',STRING_AGG(CONCAT('o.[',c.columnName,N'] = b.[',c.columnName,N']'),N' AND '),N'
  
  DELETE FROM @batch

  IF (@ROWS=0)
    SELECT @ROWS=COUNT(*) FROM #oper WHERE __oper__=''UPD''

END')
FROM #tgtColumns c
WHERE c.is_primary_key=1


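-- build the batched insert: add up to 4900 'INS' rows at a time from the source into the destination, then clear them from #oper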
SELECT @insSql = CONCAT(N'

SET @ROWS = 1

WHILE (@ROWS>0)
BEGIN

  INSERT INTO ',@DestinationTable,N'(',(SELECT STRING_AGG(CONCAT('[',c.columnName,N']'),N', ')
FROM #tgtColumns c
WHERE c.is_computed=0),N') OUTPUT ',STRING_AGG(CONCAT('inserted.[',c.columnName,N']'),N', '),N'
    INTO @batch
  SELECT TOP (4900) ',(SELECT STRING_AGG(CONCAT('s.[',c.columnName,N']'),N', ')
FROM #tgtColumns c
WHERE c.is_computed=0),N'
  FROM #oper o
  INNER JOIN ',@SourceTable,N' s
  ON ',STRING_AGG(CONCAT('s.[',c.columnName,N'] = o.[',c.columnName,N']'),N' AND '),N'
  WHERE __oper__=''INS''

    SET @ROWS=@@ROWCOUNT
	
  DELETE o
  FROM #oper o
  INNER JOIN  @batch b
  ON ',STRING_AGG(CONCAT('o.[',c.columnName,N'] = b.[',c.columnName,N']'),N' AND '),N'
  
  DELETE FROM @batch

  IF (@ROWS=0)
    SELECT @ROWS=COUNT(*) FROM #oper WHERE __oper__=''INS''

END

DROP TABLE #oper
')
FROM #tgtColumns c
WHERE c.is_primary_key=1

DROP TABLE #tgtColumns

--PRINT @operSql
--PRINT @delSql
--PRINT @updSql
--PRINT @identityInsertOn
--PRINT @identityInsertOff
--PRINT @insSql

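-- concatenate the generated scripts and execute them as a single dynamic SQL batch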
DECLARE @sql NVARCHAR(Max) = CONCAT(@operSql, @delSql, @identityInsertOn, @updSql, @insSql, @identityInsertOff)
EXEC sp_sqlexec @sql

END

OK, this is a large thing, but the principles used are simple:

  • first we create a temporary table (#tgtColumns) containing information about the columns of the destination table: schema, table and column name, whether the column is part of the primary key, an identity or a computed column, and the type name. The schema and table name are not used, but may be useful for debugging. Note that this SP doesn't check that the tables have the same number and type of columns. That's on you to ensure.
  • using the column information we create four strings that will contain the SQL for the following operations:
    • create an "operations table"
    • delete rows that are not needed
    • update rows that need to be updated
    • insert rows that are missing
  • there are four strings mostly for debugging purposes to keep them smaller than the 8000 characters that Microsoft SQL Server Management Studio can print at a time, but they are concatenated and executed as one.
  • implementation details:
    • we use FOR XML PATH('Row'), ELEMENTS XSINIL to generate a string with all the data in each row, so we don't have to compare rows column by column (see the small example after this list). We could have made this work with column-by-column comparisons, but the code would have been bulky and ugly when handling NULLs. ELEMENTS XSINIL ensures that there is a difference between an empty value and NULL.
    • a FULL OUTER JOIN is used to find (based on the primary key columns of the destination table) if rows need to be either deleted, updated or inserted. That operation is specified in the __oper__ column
    • the operations table is thus created, containing the primary key values and the operation required with two indexes: one on the primary keys and one on the operation. These indexes are not really that relevant, so one could choose to remove them.
    • a @batch table variable is used with a PRIMARY KEY on the primary key columns of the destination table, for performance reasons
    • the batching is done with TOP (4900) statements that have an OUTPUT clause: the primary key values of the affected rows are captured into the @batch table and then removed from #oper
    • for each batch we either delete destination rows ('DEL'), update destination rows from the source ('UPD') or insert missing source rows ('INS'), always matching on the primary key columns
    • if the destination has identity columns, IDENTITY_INSERT is set to ON before the update and insert steps and back to OFF at the end
    • if some rows were affected by an iteration, we continue with the same operation. If not, we look in the operations table to see whether there are still rows to be processed, because an iteration may affect no rows while work remains, due to the locking-avoidance hints skipping locked rows
  • improvements to avoid locking:
    • when we generate the operations table we use NOLOCK, which reads uncommitted values ignoring locks. This may not be what you want, but if the source table is locked for whatever reason, this ensures the merge operation is not blocked
    • when we process the batches we use READPAST, which skips locked destination rows. This ensures that the rows that can be changed are processed, while the locked ones are simply retried on later iterations, until they are unlocked and can be updated or deleted.

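As a small illustration of the row-comparison trick mentioned in the list above (a hypothetical table T with a key column A and a nullable column B):

SELECT * FROM T R WHERE R.[A] = 1 FOR XML PATH('Row'), ELEMENTS XSINIL
-- the whole row comes back as a single XML string, something like <Row ...><A>1</A><B xsi:nil="true"/></Row>,
-- so two rows can be compared as plain strings, with NULLs kept distinct from empty values
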
If you want to see what the generated SQL looks like, uncomment the PRINT lines and comment the EXEC one.

Now, I just wrote this stored procedure, so I may have missed some cases. Let me know if you find a situation where this doesn't work as expected. 

Hope it helps!

What is the structure of a table created from another via SELECT * INTO [Second] FROM [First] ?

A simple question, indeed, a basic one, but one that I had never asked myself until today. I honestly believed that the result was a generic table containing the same column names, types and values and nothing else.

The answer, though, is strange. If the original column is NOT NULL, the resulting column will also be NOT NULL. If the original column has IDENTITY, the column in the second table will also have IDENTITY. And, as you know, you can't add or remove IDENTITY from existing columns without dropping and adding them back. The DEFAULT value, though, is not transferred.

The Microsoft reference page says: 

The format of new_table is determined by evaluating the expressions in the select list. The columns in new_table are created in the order specified by the select list. Each column in new_table has the same name, data type, nullability, and value as the corresponding expression in the select list.

...

When an existing identity column is selected into a new table, the new column inherits the IDENTITY property, unless one of the following conditions is true:

  • The SELECT statement contains a join.
  • Multiple SELECT statements are joined by using UNION.
  • The identity column is listed more than one time in the select list.
  • The identity column is part of an expression.
  • The identity column is from a remote data source.

...

Indexes, constraints, and triggers defined in the source table are not transferred to the new table, nor can they be specified in the SELECT...INTO statement. If these objects are required, you can create them after executing the SELECT...INTO statement.

...

Specifying an ORDER BY clause does not guarantee the rows are inserted in the specified order.

...

When a computed column is included in the select list, the corresponding column in the new table is not a computed column. The values in the new column are the values that were computed at the time SELECT...INTO was executed.

So, my assumptions were wrong, but what do you do when you want to achieve exactly that: create a new table with the same columns in terms of name, data type and value only, and none of the constraints? It's a bit ugly, but I have not found a better alternative: just UNION ALL with a SELECT that has the same number of (nullable) columns and no rows, like this:

SELECT *
  INTO [Second]
FROM [First]
  UNION ALL
(SELECT NULL, NULL, NULL, NULL WHERE 1=0)

Assuming that the First table had four columns, this will result in a table having the same column names, types and values as the original table, but all columns will be nullable and with no identity.

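If you want to double-check the result, a quick query against the catalog views will show the nullability and identity flags of the new table:

SELECT c.name, TYPE_NAME(c.user_type_id) AS typeName, c.is_nullable, c.is_identity
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('Second')
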
Hope it helps!

  I picked up the book and went "Oh, no! The writer of this book is called qntm. He must be one of those YA Twitter writers!". No, he isn't! Actually, after reading this book, I am determined to read more from Sam Hughes. There Is No Antimemetics Division is a wonderful book, blending non-Lovecraftian cosmic horror with the spirit of something like The Andromeda Strain in a sort of Fringe-like story, only much better.

  A collection of chapters that read like almost separate short stories, the book slowly constructs this amazing world in which "antimemetic" entities exist. Don't worry about it, a few minutes after you've read my review you will have forgotten about them. The idea is cool and fresh enough, but the amount of attention to detail that the author poured into it raises it to a different level. And then more ideas keep piling up.

  The book starts from quite frightening situations that boggle the mind and challenge the way we see reality, then continues to consistently up the ante to ridiculous scales. Yes, some things are slightly contradictory and hard to accept, but imagine this is happening in a world that your brain screams couldn't possibly exist, while the clinical, scientific and technical viewpoint of the main characters convinces you it just as well might.

  I've already put another book from this author on my to-read list. I loved this book and I wholeheartedly recommend it.

  P.S. There is a web series on YouTube based on the book. But I do recommend you read the book first, then watch the miniseries.

  I like Sam Neill. He is a good actor and he played in some movies I really like. Funnily enough, he doesn't mention those! Instead he focuses on ones that meant more to him and that I mostly haven't watched.

  Did I Ever Tell You This? is a large collection of small anecdotes from the author's life that he decided he needed to write down when he was diagnosed with cancer. They are funny, heartwarming, but strangely impersonal, like stories one tells at a wine table, meant to entertain, not share, offend or expose. For that reason alone it was hard for me to finish the book.

  Imagine being at a party with friends, having fun, Sam bloody Neill being there telling everyone how he most certainly did NOT fuck every woman in London. It would be great, right? Only he keeps talking and talking and talking. Very little about the dramas in his life, the marriages, the children; he just goes on and on about funny things that happened to him while he was working with people he thinks are all great: women, people of color, Aboriginal people, all wonderful actors and human beings. It gets old fast! That's this book.

  Now, I like the guy and he comes off well in the book. The problem is that I don't feel like I know him any better now than before. He's an average Kiwi, happy to have been chosen to join a great cast of film people and trying to make good with what he got. Humble like. Kind. Funny. Doesn't feel like a real person at all!

  Anyway, the book was fine. It was just overly long and not hitting hard enough.

  Nick Cutter went to the store, bought the largest bag of horror tropes and then poured them all into The Deep. Imagine a cross between Event Horizon, It and The Thing, with every other horror cliché you could think of sprinkled in, and you get this book. Unfortunately, it doesn't work. Do you feel the horror? YEAH! But does it actually have any impact? No. It gets so horrid so fast that your mind just goes numb and asks itself why it is reading the story at all.

  The Deep has it all: horrible parents, child abuse, loving couple torn apart by child abduction, child fears, parental murder, psychopathy, body horror, supernatural horror, cosmic horror, claustrophobic horror, animal cruelty, interdimensional evil, gore, hopelessness, losing your mind, nightmares, global pandemic and, king of them all, "let's separate!" horror. Well, I am being a bit mean, because by that point nothing really mattered, but you get my drift.

  I guess there are two types of horror as far as I am concerned: intriguing and numbing. The first one is always built on hope, hope that some character has a chance, that if only they made the best choices and had a bit of luck, they could pull through. Maybe add some irony, something ridiculous that gives that person an edge when it matters most. The second one is just pointless witnessing of the suffering of another when they have no chance in hell of pulling through. The Deep veers really fast and really soon towards the second category. The horror is strong, but without a reason to exist. And boy does the guy fail to make the right choices!

  Yet, if you watched Event Horizon and thought it was great, like I did, maybe you will love this book, too. Personally I think this felt more experimental than, err... deep.

  Yes, I confess, I only expedited the reading of Mickey7 because there is a Mickey 17 movie adaptation with a pretty big budget and a cool cast. I can already see your eye roll at yet another review about an unreleased movie and not the actual book, but I promise I am writing about what I've read, so stick around :)

  This is a book akin to The Martian or maybe more the Bobiverse books, with which it also shares some plot elements: first person, light action, a reasonable and emotionally stable protagonist who is capable of being replicated after he dies or, as is the case in this story, when people think he died. I had fun with it, read it really fast, and it served as a great palate cleanser after a really annoying book I slogged through before.

  It's not a masterpiece of literature, but it's good and fun. Edward Ashton is a decent writer and, if I had an issue with his craft, it's with people being too consistent in their behavior. They are all neatly placed into their nice little boxes and they never get out of them, even in the face of traumatic deaths (of others or their own). The book also kind of drags, focusing too much on trivialities and less on the interesting aspects of the characters. However, this is the setup book, the first in a series as is tradition, so maybe the next volume, Antimatter Blues, will be better. I intend to read it, too. Maybe not immediately, though. Let's see how I feel about the movie.

  Talking about the movie, I think it's clear they are going to change the plot significantly, hence the different title. And I get it. The story is about colonizing an alien planet after years of relativistic travel, with a lot of internal monologues, flashbacks and shortened stories of other colonies from books that the protagonist reads, but also gruesome deaths and insectile alien life. Hard to cram that into a movie and keep within the budget.

  A People's Future of the United States: Speculative Fiction from 25 Extraordinary Writers is one of those books: 25 different short stories about how liberals, gays, women, neurodivergent people and people of color are abused or openly hunted by possible future American systems. And the regimes described are quite awful indeed, yet the overall feeling one gets from reading the book is just eye-rolling disbelief. I only read it because I like how Victor LaValle writes; the problem is that he only edited this collection and wrote none of the stories inside, and it was a grind to get through.

  I blame the first story. It was so beautiful and reasonable, where a librarian shelters people from both sides of a divided America and they come together in discussing a single book they had all read. It set a tone that I couldn't wait to get more of. And the second story was the most strident, hysterical, yet at the same time bland piece of literature I've ever read! And it was followed by a lot of similar stories, where everybody who was gay, of color, autistic, female and sometimes all of the above was openly and legally terrorized by a system run by the evil conservative Whites. The tonal shift was so brusque I felt literary whiplash!

  Maybe when your subject is the systematic abuse of minorities and you're also part of one or several of these minorities, it's pretty hard for your story to get openly rejected. That's the only reasonable explanation I could find for the wide variation in quality of these stories. There were some that were really good. Unfortunately, only a few of them, and most of the others could only kindly be called mediocre.

  I just couldn't believe it! The same people who were afraid of an ever more improbable future (The Handmaid's Tale feeling heavenly in comparison) in which non-normalcy is illegal, were describing with glee how the wheel was turning. For example, there was one where a genetic time travelling bomb backfired and all of America was now diverse. That kind of laughable diversity, with no national identity, just a bunch of people all being different from each other, looking different, yet having a united community and all of the American problems magically gone.

  I couldn't shake a quote from my head: slaves don't dream of freedom, but of becoming masters. The same people that were afraid of intolerance wrote short stories about us vs them, where "them" were caricaturesque inhuman villains and deserved everything coming to them.

  For sure there is some emotional catharsis in imagining a terrible future in which everything you hate is openly evil, thus giving you permission for all the fear and hate and extreme fantasies. Imagine Death Wish, but now the hero is a gay woman in a wheelchair killing fraternity bros. How is that not a movie already? Barf!

I have been maintaining a Chrome browser extension I wrote for more than a year now and I always get the occasional user asking me if I can make it work with Firefox. And until now I said no, because I use the "world" feature in manifest.json. But I was wrong.

You see, Google is aggressively pushing for manifest version 3, discontinuing support for browser extensions using v2. So my extension was written from the beginning with version 3. In order to load the scripts and CSS files that I needed to run in the context of the page, I used the world:"MAIN" feature. When I tried it with v2, Chrome immediately told me "The 'world' property is restricted to extensions with 'manifest_version' set to 3 or higher." (Or higher. lol) So when I looked for how to use world in Firefox manifest v3 I got a lot of documentation about Xray Vision and how they absolutely refuse to implement the world feature.

I googled, I scoured the web, I read all of Stack Overflow, I even tried LLMs like ChatGPT or Gemini, with hilarious results, like Gemini accusing me of wanting to circumvent the security of the browser. No. There is no simple way of doing the same thing in Manifest v3 on Firefox. Case closed, right? I mean, who even uses Firefox? Less than 2.5% of people on the Internet. And it's a shit browser.

But no, here comes a guy and tells me that he made it work. HOW?! By now you probably guessed it, I left enough hints after all. You just have to use manifest version 2! 'world' is only restricted for version 3 in Chrome browsers!!!

Although, now that I am writing this post, I see stuff like "Manifest V3 updates landed in Firefox 128", so maybe it's something that only started working recently (the article is from July 2024). Hmm...

I guess there is a silver lining to the fact I refused users for a year now, because I had a year in which I didn't have to deal with the buggy debugger (heh!) in Firefox.

Anyway, if you have the same problem, that's your solution: make a v2 manifest for Firefox and a v3 manifest for Chromium browsers. Or use it in v3 as well, because apparently Firefox changed their minds. And here I was, writing this blog post with such glee that I would get to help people who had the same problem as me. Ugh!

NO! I will not make it work for Safari! Eat a dick!

  Just a few days ago I was writing about how important it is to tell Entity Framework what SQL type to use in order to avoid costly conversions. In fact, it wasn't so much an EF issue as it was an SQL one: converting even character types to other character types or changing the collation was surprisingly expensive. In this post I will show you how important it is to choose the right type for your querying columns, especially the primary key.

  First imagine this scenario: you get some data from an outside source, rows and rows of it, and you have to store them and query them in SQL. The value that uniquely identifies a row is a small string, maybe 50 characters long. How do you proceed?

  My first naive solution was the most obvious one: just create a table that has a column for each value in the rows and put the primary key on the identifying one. But this leads to immediate performance losses:

  • by default, a primary key is a clustered index - text is not sequential, so at every insert the database engine may have to split pages and physically move data in order to keep the rows in the alphabetical order of their identifiers
  • a primary key is a unique index - meaning text will have to get compared to other text in order to determine uniqueness, which is slow
  • by default, SQL is case insensitive - meaning that all text comparisons will have to be made taking into account capitalization and accents
  • 50 characters is a lot - even without Unicode support, that's 50 bytes, more than 12 times the size of an integer, meaning the primary key index will be large; and slow

  "But!", you will undoubtedly say, if you put the primary key on some other column, you will still have to create a unique index on the identifier. Isn't this just pushing the problem farther down the road? The size and speed limitations will be the same. And primary keys are clustered only by default, but they can be declared as not clustered. And SQL doesn't need to be case insensitive, all you have to do is change the collation of the column to be binary and it will be compared faster. Wouldn't that solve the problem?

  No. In fact, my final solution, which worked five times faster, did not have an index on the identifier column AT ALL. Incidentally, I did end up changing the collation, but only because the idiots sending me the data were sending it case sensitive.

  Without further ado, here is what I did:

  • an INT column with IDENTITY(1,1) as the primary key - which ensures a fast insertion due to the sequential nature of the value, fast query speed and low usage of disk space for the index
  • an INT column holding the checksum of the identifier - which when indexed, is fast to query and doesn't use a lot of disk space for the index

   So how do I query on the identifier? Simple: I calculate the checksum of the string and then I look it up in the database - which uses the index to locate the few strings that have the same checksum, then just finds the right one by enumerating through them. I query on the checksum column AND the text identifier. And there is an added bonus: I only need to do this once. If I need the record from the DB again, I query it directly through the integer primary key.

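To make this concrete, here is a minimal sketch of the setup, with hypothetical table and column names (the checksum itself is computed in application code before every insert and lookup):

CREATE TABLE dbo.Things (
  Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY, -- clustered by default, sequential, only 4 bytes
  ExternalId VARCHAR(50) NOT NULL,           -- the external identifier, no index on it
  ExternalIdChecksum INT NOT NULL,           -- checksum of ExternalId, computed in code
  Payload NVARCHAR(MAX) NULL
)
CREATE INDEX IX_Things_ExternalIdChecksum ON dbo.Things(ExternalIdChecksum)

-- lookup: the index narrows the search to the few rows with the same checksum,
-- then the comparison on ExternalId picks the right one
DECLARE @checksum INT = 123456789, @externalId VARCHAR(50) = 'ABC-001'
SELECT TOP (1) Id, ExternalId, Payload
FROM dbo.Things
WHERE ExternalIdChecksum = @checksum AND ExternalId = @externalId
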
  Entity Framework has an automatic memory cache, so when I query the database entity using a business model - as good separation of concerns practice would dictate - it gets it really fast from memory. That cache also uses just the int to identify an entity, which means double the benefits!

  The eagle-eyed reader will have noticed that I am not using a unique index on the identifier, so technically I could create multiple rows with the same one. However, my application always looks for the existing record first. But if you really worry about data consistency, the index on the checksum column can be replaced with a unique index on the checksum and identifier columns. It will take more space, but it will be just as fast.

  Another thing that you may have noticed is that I use a checksum computed in code, not the database-provided functions that achieve the same thing. At first glance the latter is an instant win: just create a persisted computed column that calculates the checksum or binary checksum of the identifier column. However, this would be awkward when querying, since you would have to craft a stored procedure or a custom SQL command to take the identifier and query on its checksum. In my case I just calculate a checksum in code - and not with the lazy string.GetHashCode function, which may be subject to change and is already different between 32 and 64-bit systems.

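For reference, the database-side alternative mentioned above would look something like this (using the same hypothetical names as the sketch earlier):

-- persisted computed column using the built-in CHECKSUM function, which can then be indexed
ALTER TABLE dbo.Things ADD ExternalIdDbChecksum AS CHECKSUM(ExternalId) PERSISTED
CREATE INDEX IX_Things_ExternalIdDbChecksum ON dbo.Things(ExternalIdDbChecksum)
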
  Of course, if you want your text columns to be case and/or accent insensitive, you will have to store the hash code of the lowercase and unaccented string or use an implementation that is case and accent insensitive. This may not be trivial.

  Further tests showed that just using a non-clustered index on the identifier column, even a unique one, was only slightly slower, maybe by 5%. However, the space taken by indexes increased by 20%. So I can understand why you might find the checksum part a bit off-putting and skip it.

  Hope this helps!

  P.S. Why did this solution provide such a huge performance gain? Obviously the SQL team would have implemented a sort of checksum for their text index, this should have been working natively and faster than any possible implementation I could make. Well, I don't know the answer. In fact, this all could be some quirk of Entity Framework and the SQL queries would not be optimizable to such a degree. I will attempt to test that using purely SQL commands. But meanwhile, all the points I made above are valid and with a little more work you can have a lot more control on how the system works.

  What do you remember from the Terminator movies? It's the Skynet killer robot, obviously, the people who seem to always be related somehow, and a hero that needs saving for the sake of their work in the future, but running for their lives in the present. In Terminator Zero you get all of these, to the point that they feel a little overdone. But the animation is good and the story is interesting, adding some logical elements that I've only seen in Terminator: The Sarah Connor Chronicles, which I liked a lot and wanted more of. I loved that they set the action in a clearly different timeline than our own and also tried to make it clear the ridiculous cycle of trying to fix the past from the future.

  Unfortunately, they've decided to add children to the mix. And I mean children that need a nanny, not 24-year-old Claire Danes. Most of the time it's the children and their very Japanese emotions filling the screen, while their father, a mysterious tech mogul, keeps saying cryptic things until almost the end of the series for no good reason. The Terminator is thankfully not in the shape of Arnie, and the human fighter from the future is a woman. It is also set in Japan. The series ends with a promise rather than with closure, although I doubt they will make a second season.

  It's eight episodes of 20 minutes each, but I think the story was a little too simple for 160 minutes and it could have easily been a more concise two hour animation film. What's the difference, really, between a series you release all at once and a feature film anyway?

  While I applaud stories told in animation - readers of this blog may already know that I believe that's how you do and say brave things today, especially in sci-fi and horror - being a Terminator story meant it was locked into a preestablished framework and couldn't be too creative. Just consider taking some pages out of Screamers, for example, and you understand what I mean. I would rather watch seasons and seasons of Terminator anime than keep hoping for something decent in live action. The thing is that special effects are already very advanced, but they also cost a lot of money, meaning that you either underdeliver on viewer expectations or have to make a whole bunch of money to break even. Animation is not like that and it's also a lot more flexible.

  All in all I liked the show and I recommend it, but don't expect too much.

  I've built an application and, like any lazy dev out there, I focused on the business logic, the project structure, the readability, the comments, the dependency injection, the unit tests, you know... the code. My preference is to start from top to bottom, so I create more and more detailed implementations of interfaces while going down to the metal. The bottom of this chain is the repository, the class which handles database access, and I've spent little time trying to understand or optimize that code. I mean, it's DB access, you read or you write stuff, how difficult can it be?

  When it was time to actually test it, the performance of the application was unexpectedly bad. I profiled it and I was getting reasonable percentages for different types of code, but it was all taking too long. And suddenly my colleague says "well, I tried a few things and now it works twice as fast". Excuse me?! You did WHAT?! I have been trying a few things too, and managed to do diddly squat! Give me that PR to see what you did! And... it was nothing I could see.

  He didn't change the code, he just added or altered the attributes decorating the properties of the models. That pissed me off, because I had previously gone over the generated SQL with SQL Profiler and it was all OK. So I executed my code and his code and recorded the SQL that came out:

  • was it the lazy loading? Nope. The number of instructions and their order was exactly the same
  • was it the explicit declaration of the names of indexes and foreign keys? Nope. Removing those didn't affect performance.
  • was it the ChangeTracker.LazyLoadingEnabled=false thing? Nope, I wasn't using child entities in a way that could be affected.
  • was there some other structure to the generated SQL? No. It was exactly the same SQL! Yet my code was using thousands of CPU units and his was using none.
  • was it magic? Probably, because it made no sense whatsoever! Except...

Entity Framework generates simple SQL queries, but it doesn't execute them as you and I would. It constructs a string, then uses sp_executesql to run it. Something like this:

exec sp_executesql N'SELECT TOP(1) [p].[ID], [p].[TXT], [p].[LUP_TS]
FROM [sch].[table] AS [p]
WHERE [p].[ID] = @__p_0',N'@__p_0 nvarchar(64)',@__p_0='xxxx'

Do you see it? I didn't, until I started to compare the same SQL in the two versions. It was the type of the parameters! Note that the aptly named parameter @__p_0 is declared as NVARCHAR, while the actual column in the database was VARCHAR. Meaning that the code above was always converting the column values just in order to compare them. The waste of resources was staggering!

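A quick way to spot such a mismatch is to compare the parameter types in the captured query with the actual column types (using the placeholder table name from the query above):

SELECT c.name, TYPE_NAME(c.user_type_id) AS typeName, c.max_length
FROM sys.columns c
WHERE c.object_id = OBJECT_ID('sch.table')
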
How do you declare the exact database type of your columns? Multiple ways. In my case there were three different problems:

  • no Unicode(false) attribute on the string columns - meaning EF expected the columns to be NVARCHAR
  • no TypeName parameter in the Column attribute where the columns were NTEXT - meaning EF expected them to be NVARCHAR(Max)
    • I guess one could skip the Unicode thing and instead just specify the type name, but I haven't tested it
  • using MaxLength instead of StringLength - because even if their descriptions are very similar and MaxLength sounds like it applies to more cases, it's StringLength that EF wants.

From 40-50ms per processing loop, it dropped to 21ms just by fixing these.

Long story short: parametrized SQL executed with sp_executesql hides a possible performance issue if the columns that you compare or extract have even slightly different types than those of the parameters.

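If you want to see the effect in isolation, you can run the same lookup with both parameter types against the VARCHAR column (a sketch using the placeholder table and column names from the captured query above):

DECLARE @nv NVARCHAR(64) = N'xxxx'
DECLARE @v VARCHAR(64) = 'xxxx'

-- mismatched type: the VARCHAR column is implicitly converted to NVARCHAR for every comparison
SELECT TOP (1) * FROM [sch].[table] WHERE [ID] = @nv

-- matching type: the value is compared directly, so the index on the column can be used efficiently
SELECT TOP (1) * FROM [sch].[table] WHERE [ID] = @v
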
Go figure. I hate Entity Framework!