Why this article should never have been written

  It's a bit too early to do this. I am sure no one in their right mind would want to display any non-positive words about George Floyd at this moment for fear of reprisals. But I feel like that is exactly what must be done. If Floyd was an innocent victim, a hero that overcame his background only to be brought down, literally, by the heavy boot of law enforcement, then what chance do normal people have?

  I didn't want to be writing about this. I think that the entire thing has become grotesque and the only thing that could now bring these Whites and Blacks together, corrupt police and gangstas, racists and anti-racists... is me! I am sure that if I were to enter the argument, all these people angrily hating each other would come together in trying to murder me. Because while I understand both sides, I can't fully endorse any of them. Racists are just dumb. What the hell does race matter in anything? But I do understand anti-anti-racists, the ones that hate being put together with assholes because of the color of their skin. Anti-racism protesters are dumb. Well, maybe. I am sure all of this protesting is finally going to have an impact, and this is good, but have you seen some of these people? Manicaly jumping on toppled down statues and roaring in megaphones about how great they are because they oppose evil. In this whole discussion, again, normal people are left out. They are boring, they don't clump together, they don't stand out, each one has their own slightly different opinion. Well, this is mine.

The gentle giant saint versus the black monster

  Something happened today that pushed me to write this post. I saw a Facebook post that detailed the criminal record of George Floyd. Cocaine dealing, two armed robberies, one which held him back four years, addiction and, when he was arrested, metamfetamine and fentanyl in his blood and the incriminating fake twenty dollar bill. Was it true? It is a very important question to ask, because many of these things are complete bullshit. So I googled it. And the result was actually worse: almost nothing!

  There are just two websites that actually advertise Floyd's criminal record: Great Game India - self titled "Journal on Geopolitics and International Relations" and sporting articles like "Coronavirus Bioweapon - How China Stole Coronavirus From Canada and Weaponized It" and "How A Pornstar & Sci-Fi Writer Influenced WHO Policies On Hydroxychloroquine With Fake Data" - and The Courier Daily, which seems legit. Funny though, when you search for "George Floyd criminal record" you get Game India first and not The Daily Mail, which is linked in their article and who actually did the research and published the court documents attesting to that. They are fifth on the search page. More, during the writing of this blog post, the Courier Daily link disappeared from my Google search and Game India was demoted to second place, with a "gentle giant" story on top instead.

  Either way, other than Game India, no other news outlet even phrases the title as to indicate George had been a criminal. The few who tackle the subject: The Star, The Daily Mail itself and even The Courier Daily, just portray the man as a flawed individual who nevertheless was set to change, found religion and even moved to Minneapolis to escape his past. And I agree with this viewpoint, because as far as I can see, the armed robbery had been many years before and the man had changed, in both behavior and intent. But hiding this doesn't really help. The Daily Mail article was published on the 26th of May, one day after Floyd's death, and the information therein is either not discussed or spun into a "gentle giant" narrative. He was a bouncer before the Coronavirus hit and he lost his job. George the gentle bouncer?

  One thing is certain, when you search for George's criminal record, it's more likely you get to articles about the criminal records of the arresting officers or Mark Wahlberg's hate crimes than what you actually searched for.

How did George die and why it doesn't matter

  But there is more. How did George die? You would say that having a knee on their throat while they gasp for air saying "I can't breathe" would be enough. But it's not. Several different reports say different things. The first one preliminarily portrays Floyd as a very sick man: coronary artery disease, hypertensive heart disease, even Covid-19. There were "no physical findings that support a diagnosis of traumatic asphyxia or strangulation", but instead they diagnosed it as a heart failure under stress and multiple intoxicants. Finally, two days later, the report admits "a cardiopulmonary arrest while being restrained" by officers who had subjected Floyd to "neck compression". But Floyd's family would not have that, so they commissioned their own autopsy. The result? Floyd died from "asphyxia due to compression of the neck", affecting "blood flow and oxygen going into the brain", and also from "compression of the back, which interferes with breathing". The medical examiner said Floyd had no underlying medical problem that caused or contributed to his death.

  So which was it? It doesn't really matter. He died with a knee on his neck, which should never happen to anyone, and both reports admit it was a homicide. But ignoring all of these other data points doesn't help. People just vilify the policeman and raise George to saintly status. You want to solve something? Start with the truth. All of it. Now both sides have the ammunition required to never change their minds.

  I have not found any article that makes a definitive claim on which report is the good one, if any. They all lean on believing the second, because it fits, but if the first one was a complete fabrication, why wasn't anyone charged with it?

Wikipedia v. Facebook

  So of course I would find about Floyd's criminal past from Facebook. It makes so much sense. It is a pool of hateful bile and rank outrage that brings ugly right up to the surface. But this time it pointed me towards an interesting (albeit short) investigation. Without it, I would have swallowed up the entire completely innocent victim narrative that is pushed on every media outlet. So, once in a blue moon, even Facebook is good for something.

  As you may have noticed above, I took some information from Wikipedia, which has an entire article dedicated to George Floyd's death. It is there where the information about his two medical autopsies is also published. On George Floyd's page, his early life consists of: basketball, football, his family calling him a gentle giant. Then he customized cars, did some rap and was an informal community leader. Only then did he get arrested a few times then put in jail for five years. He was charged in 2007, went to jail in 2009 and was released on 2013. It's just six years and it does not define a man, but try to say that to a police officer who has just read the fact sheet on his cruiser's terminal and has to arrest a 1.93m tall intoxicated man.

  And you may want to read the entire chain of events. The police didn't just put him on the ground, they talked to him, they put him in their car, they fought, they pulled him out, all while being filmed and surrounded by a crowd.

You will never gonna get it

  How much of this is truth and how much of it is spin? You will never know. There are so many people that have to justify their own shit using carefully chosen bits and pieces from this that there will never be a truthful image of who George Floyd was and what happened to him. He is now more than a man and also much less: he is a symbol, rallying people to cry out against injustice, as they see it. The greatest thing George Floyd ever did was die and after that he stopped being human. How sad is that?

  In truth, he was a flawed man. He was not perfect. He was your everyman. A policeman casually killing him while getting filmed doing it hurts on so many levels because that could be you. That was you or your friend or your relative somewhere. But no, they had to make it about being black and being the gentle giant and being killed by the bad psycho cop and his cronies. It sounds like a Hollywood movie because it is scripted as one. You can be certain that at this point several documentaries and movies are in the works about this. And when you'll see it, a big time celebrity will be interpreting Floyd and the cop will be played by that actor who plays psychos in every other movie because he has that face. Once you go black, you can never go back.

  I am not defending anyone here. As I said in the beginning, I am on nobody's side in this. I just hope no one will knee me or my friends to death while everybody films it down.

The world has spoken

  I find it amazing that the protests in Minneapolis have spread to the entire world. It makes me hope that they will slowly turn into protests about things that matter even more than the color of one's skin, like our responsibility as a community to carefully choose our guardians, like having to think for ourselves if something is right or wrong and maybe doing something about it. George Floyd was killed slowly, over nine minutes, while people stood around and filmed it. Not just the other officers, but civilian bystanders, too.

  There were people who did something. At one point a witness said: "You got him down. Let him breathe." Another pointed out that Floyd was bleeding from the nose. Another told the officers that Floyd was "not even resisting arrest right now". Yet another said "Get him off the ground ... You could have put him in the car by now. He's not resisting arrest or nothing. You're enjoying it. Look at you. Your body language explains it." But that's all they did. Wikipedia calls them "witnesses", but you have to wonder: what skin color were they? Were they afraid they would be next and that's why all they could was beg for George's life as he slowly died? Or did they believe the story that American TV has fed them for decades, that cops are ultimately good people who break the rules in order to protect the innocent? Or maybe a more recent belief had taken hold: that filming injustice makes you a hero and it's more than enough.

  The world has spoken. Racism must go, police brutality must go. Let's not replace them by carefully crafted fantasies, though. Let's see the truth as it is so we can make it better.

2020 is great so far

  I am not being sarcastic. After a virus that punched presidents of the free world and dictators alike in the nose, that made people question their fake feelings of safety and forced them to act, here comes this age of protesting how things are. We have been shaken awake. Will we fall asleep again? I am sure we will, but some things will have changed by then. And the year is not yet over.

I know I am shooting myself in the foot here, but, to paraphrase some people, staying quiet doesn't help anyone. I've come to love Dev.to, a knowledge sharing platform aimed at software developers, because it actually promotes blogging and dissemination of information. It doesn't do enough against clickbait, but it's great so far. So, hungry for some new dev stuff, I opened up the website only to find it spammed with big colorful posters and posts supporting female devs. It was annoying, but it got me thinking.

  I like women in software! I too can honestly say I support them. I've always done so. I worked with them, mentored them, learned from them, worked for them, hired them. I want them to get paid what they are due, just like any other person: quiet, happiness, money, respect, understanding. I support their right to tell me when they hate (or love) something I do or say and I am totally against assholes who would pray on them or belittle them. Not because they are women, but because they are human, and no one should stand for stupid little people who only think of themselves and have a chip on their shoulder.

  And yes, women need more support than men, because they traditionally did not have it before. For them it is an uphill battle to fit into communities that contain few females. They have to butt in, they have to push and struggle and we need to understand their underdog status and protect them through that. But not because they are some fantasy creature, or perpetual victims or some other thing, but because they are people. This applies to them, to minorities, to majorities, to every single person around you. I would feel the same about some guy not getting hired because he is too muscular as for some woman who won't get a job because she's bland looking.

  So ask yourself, are you really supporting women, or are you just playing a game? Are you the one shouting loudly in the night "Night time! Everybody go to sleep!"? Are you protecting women or singling them out as something different that must be treated differently? Are you actually thinking of people or just doing politics? Because if you decide to annoy devs on behalf of women, you'd better do a good job supporting them for real.

  I guess I don't have to tell you about ad blockers and browser extensions that improve YouTube. They are a dime a dozen and bring many features to the habitual YouTube watcher. However there is one particular new YouTube annoyance that you don't really need an extension to get rid of: the dreaded Video paused dialog.

  To get rid of it is easy: on an interval, check if there is a visible element of a certain type containing a certain text and click its button. While this can be done in simple Javascript, I am lazy, so the script that I am using will first load jQuery, then run a one line function periodically. This code is to be copied and pasted in the Console tab of the browser's development tools, then press Enter.

const scr = document.createElement('script');
scr.setAttribute('src','https://code.jquery.com/jquery-3.4.1.min.js');
document.querySelector('head').appendChild(scr);
setInterval(()=> { 
  $('#button:contains(Yes)','yt-confirm-dialog-renderer:visible:contains(Video paused)').click();
},500);

It's easy to understand:

  • create a script element
  • set its source to jQuery
  • append it to the page
  • execute every 500 milliseconds a code that:
    • finds the element with id button containing the text "Yes"
    • inside an element of type yt-confirm-dialog-renderer which is visible and contains the text "Video paused"
    • click the element

There is an even more comfortable solution, though, that I recommend. You will need a Chrome extension called cjs that loads whatever script you tell it in whatever page you want. It gives you the option to inject jQuery, so all you have to do is write 

setInterval(()=> { $('#button:contains(Yes)','yt-confirm-dialog-renderer:visible:contains(Video paused)').click(); },500);

 as the script to be executed on YouTube.

That's it. You're all done.

  I see a lot of pages about how to write blog posts. I read them, because I am both curious and sincere in my desire to make my blog popular and spread the knowledge I've amassed here. They are always crap. Take one that says the best tool to get a blog popular is to use Google Trends or Google autocomplete to see what people are searching for. And the results are always incredibly stupid. Like "how to add one to one to get two". I am paraphrasing a bit here, but you get the gist. Go "worldwide" and the first trend is always some Chinese spam. Another post is saying that a blog post should be written as four drafts: one for what you want to say, one for how you want to say it, one for peer reviewed content and the final one that actually is what you want to publish. It sounds great, but it implies a level of work that sometimes is prohibitive related to the subject of your post. Sometimes you just want to share something as a stream of consciousness and be done with it. Is that better? No. But it sure beats NOT writing anything. There is always time to improve your work and get peer review AFTER publishing it.

  There are two big types of people blogging. The first category is akin to reporters and media people. They want to get their message across for reasons that are rather independent of the message itself. They want to earn money or influence or some other kind of benefit. I don't have any advice for people like that. Not because I disconsider their goals, but because I have never blogged for an ulterior reason. The second category of bloggers is akin to writers: they want to get their message across because they feel there is some value in the message itself. I consider myself such a person, although I probably suck as a writer. This post is for people like that.

  The most important part of writing a post is motivation. And I don't mean just the reason for writing it, but the reason for wanting to share it. For me, most of the posts I write are either content that I consume, such as books, or stuff that I think is worth considering or technical stuff that I've stumbled upon and I believe people would want to find if googling for it instead of wasting the time I wasted to solve it. Now, the books and the personal idea posts I totally agree are ego boosting shit: I feel like it's important enough to "share", but I don't really expect people to read it or that there is any inherent value in them other than getting to know me better. And everyone wants to understand other people better on the Internet, right? In the end they are just a personal log of random thoughts I have. My blog is something that represents me and so I feel that I need to share things that are personal to me, including thoughts that are not politically correct or even correct in any possible way. One can use Facebook for this, so I won't write about those posts. They still reach some people and inform their choices, which is something I love.

  What is left is the posts that I work for. You have no idea how much I work on some of these posts. It may take me hours or even days for content that you read in a few minutes. That is because I am testing my ideas in code and creating experiments to validate my beliefs and doing research on how other people did it. A lot of the times I learn a lot from writing these posts. I start with the expectation that I know what I am talking about only to find out that I was wrong. The important part is that I do correct myself and some of the blog posts here are exclusively about discovering how wrong I was. There is nothing more rewarding than writing something that you feel others might benefit from. Perhaps other than getting feedback about how your post benefited others. Publishing your failures is just as important as publishing your successes.

  Yes, I know, if I learn something new by doing what I need to be doing, then sharing the results is like writing for myself, too. It's ego boosting, for sure. However, it would be even more obnoxious to believe no one is like me and so no one would benefit from the same work. There was a time when people came to my blog and asked me about all kinds of technical problems and I worked with them to solve them. There were usually incredibly simple problems that posed difficulties only to the laziest people, but it felt good! Then StackOverflow came along and no one actually interacts with me. But I have solved stupid problems that I still keep getting thanks for, even (maybe especially because) if the technology is really old and obsolete. Many other blogs published cool things about subjects that are not fashionable anymore and then just disappeared. The value of your content is that it may help people in your situation, even if they don't share your sense of now and even if all they take away is how NOT to do things.

  Sometimes you are looking for the solution for a problem and after hours of work you realize the problem was invalid or the solution was deceptively simple. It's the "Oh, I am so stupid!" moment that makes a lot of people shy away from writing about it. I find that these moments are especially important, because other people will surely make the same mistake and be hungry about finding the answer. OK, you admit to the world you were stupid, but you also help so many other people that would waste time and effort and feel as stupid as you if not for writing the post.

  My take on writing a blog post is that you just have to care about what you are writing. You may not be the best writer out there, you might not even be explaining the thing right, but if you care about what you are writing, then you will make the effort of getting it right eventually. Write what you think and, if you are lucky, people will give you reasons to doubt your results or improve them. Most likely people will NOT care as much about the subject as you, but you are not writing because of them, you are writing for them. Some of your thoughts and toils will reach and help someone and that is what blogging is all about.

  The last thing I want to mention is maintenance. Your work is valid when you write it, but may become obsolete later on. You need to make the effort to update the content, not by removing the posts or deleting their content, but by making clear things have changed, how they did and what can be done about it. It is amazing how many recent posts are reached only because I mentioned them in an "obsolete" post. People search for obsolete content, find out it's too old, then follow the link to your latest solution for that problem. It makes for good reading and even better understanding of how things got to the current point.

  So, bottom line: publish only what you care about and care about your readers, keep the posts up to date, publish both successes and failures.

As I was saying in a previous post, I came back from vacation to about 5 DMCA notices about posts that had no infringing content that I could see. I've tried changing the URLs of links, the images, then reposted the posts. DO NOT DO THAT ON BLOGGER! Apparently, even if their rule says to not publish the post unchanged, they consider any republication of a post automatically put in Draft because of a DMCA notice to be infringing. And what happens after trying to republish five posts a few times? Your blog gets deleted and any attempt to contact the assholes at Blogger is ignored or, worse, a complete farce.

Example: A DMCA requests comes for a blog post containing only my text, a link to IMDb and a YouTube embed. The post gets sent to Draft. I send a counter notice to Blogger, then republish the blog. This might be considered "infringing" of their policy, but it only happened once, so I don't think this is the reason why they deleted the blog. Then they delete the blog (all the blogs on that user, I had some private ones that I hadn't backed up at all and now they are completely gone) and send me an email containing, I kid you not, this text: "we have removed the blog. Thank you for your understanding."

I've tried contacting them about my user being blocked from Blogger and the blogs deleted, but they have replied only with an automated message saying it usually takes two days to get back to me. Two weeks later, no reply from them... except the ones for the DMCA counter-notices. Guess what they said? I kid you not: "Thanks for reaching out to us. We have no record of the following URLs having been removed by Google due to a legal complaint". I replied with a quote of their previous email. No reply.

So the only solution, until I find my own hosting and I set up my own blog, is to rename the blog and move it on another user.

This is a cautionary tale, too. Always backup your blog and any content you post on another platform, no matter how "no evil" it is. Also, make sure you backup everything, even the little details, checking the result, because backup and restore tools don't bring money to the platforms, so they usually suck big time. For example for Blogger there is no way to import a post with a JavaScript script in it that contains comments, because the newlines have been stripped and the comment affects the next line of code. And most of all, when your post gets DMCA'd, leave it in draft, file a counter-notice and pray that they actually mean something.

On the administrative side, because of the domain change and the backup SNAFU, the blog might not work completely as expected. Please let me know if anything goes wrong so I can fix it. For example, the links to the blog are automatically converted by JavaScript, because I can't export, replace and import the posts again without losing the JavaScript in some pages and posts. Plus, I have no idea if this blog is going to last.

Wish me luck, folks!

You will quickly understand why I felt the need to say I was unbiased, but let me first demonstrate how much unbiased I was: I went into this raw fruits store, with an errand from the wife, and wanted to get something from me. Usually I like the caju and macadamia nuts, but I didn't want to have the conversation about why did I spent so much on something I eat out of boredom, so I looked around to get something else. And here they were, packaged and sold just like any other dried fruits or nuts: bitter apricot kernels. So I bought a 200 g bag.

Back in the office, I opened the bag up and I started eating. They were bitter as hell, but I didn't mind it much. I was eating some of them, then switching to candied ginger (which I'd absolutely love if it weren't so sweet), then back again. After a while, though, I'd had enough. About half of the bag in, I couldn't really find a reason to keep eating them. My colleagues had all refused to eat (and spit) more than half of one. But I was curious what they were actually for. People who love bitter tastes, maybe?

So went on the Internet and KABOOOM! mind blown. Just for scale, try to look for yourself at the dimensions of the can of worms I'd just opened: apricot kernels.

Turns out that the "active ingredient" in the apricot kernels is amygdalin, a substance that turns to cyanide in the gut. Yes, you've heard that right: I had just bitten the tooth, dying for the motherland before I could spill the beans. Google had already failed miserably, by serving first a page that explained how Big Pharma and governments conspired to keep this wonder drug from the public. The second page was Wikipedia, then every single conspiracy nut site, sprinkled with the occasional very dry scientific study that bottom lined at "we don't really know".

But I am getting ahead of myself. At this point I was already severely biased and I first need to describe my earnest experience to you. Short story: accelerated heartbeat, fever, terrible headache and nausea that lasted for half a day. Also, didn't die, which was good.

Back to my rant. So, some guy looked at the chemical structure of amygdalin and thought it looked like a B complex vitamin, so he named it vitamin B17. It was quickly marketed as a cure for cancer, despite numerous trials to show that it wasn't. And no, it's not a vitamin for humans either. It is not made in the human body, but it's not needed, either. The bag was not labeled anything dangerous, because it came from the outside of the European Union, which has a law regarding this. Here is some advice for both the EU and the US. Turkey was OK, though, so it only said "great for cancer, eat 5 to 8 seeds daily, not all at once".

So how fucked was I after eating about one hundred of them? A European Food Safety Authority article said that eating three kernels exceeds the safe level for adults. A toddler could do that from just eating one. An article from Cancer Council Australia detailed the child fatalities due to ingesting apricot seeds. Another article was telling me of an adult who got poisoning, but he was both stupid and extreme (he was taking a concentrated extract) and didn't die anyway. A thousand other sites were telling me how amazing my health will be after I had just eaten ten times the daily dosage they suggested.

Drowned in the sea of controversy regarding apricot kernels I've decided to look for the chemical and medicinal treatment for cyanide poisoning. Step 1: decontamination. It was kind of too late to go to the toilet and do the anorexia thing. Step 2: take some amyl nitrite (and then some intravenous things). Wait, that's a party drug. I could maybe get one in a sex shop. There was no home remedy and most of all, even if the amyl nitrite seems to work, no one seems to know exactly why other than the vasodilating effect it obviously has. Another possible antidote is (ironically) hydroxocobalamin, also called vitamin B12a. In the end some vitamin C and a headache pill did wonders, just in case you eat a bunch of apricot kernels and feel awful. Obviously, if it were a serious condition I would have died at the keyboard, trying to wade through the marketing posts and the uselessly dry official reports. Also, not enough easily available party drugs, I dare say.

So, days later the bout of shaky hands, fever and the horrible headache that only blood oxygen deprivation can bring, I decided to write this post. I doubt people will find it with Google, but maybe just my immediate friends will know not to eat this crap.

New Scientist is a science oriented news site that has existed in my periodic reading list for years. They had great content, seemingly unbiased and a good web site structure. But they went greedy. Instead of one in ten articles being "premium" now almost all articles I want to read are behind a pay wall. While I appreciate their content, I will never pay for it, especially when similar (and recently, event better) content can be found on phys.org or arstechnica.com completely free. So, I feel sad, but I need to remove New Scientist from my reading list. I understand there is an effort in what they do and that quality requires investment and cost, but brutally switching from an almost free format to a spammy pay wall is unacceptable for me.

I have invented a new way to write software when people who hold decision power are not available. It's called Flag Assisted Programming and it goes like this: whenever you have a question on how to proceed with your development, instead of bothering decision makers, add a flag to the configuration that determines which way to go. Then estimate for all the possible answers to your question and implement them all. This way, management not only has more time to do real work, but also the ability to go back and forth on their decisions as they see fit. Bonus points, FAPing allows middle management to say you have A/B testing at least partially implemented, and that you work in a very agile environment.

Finally, finally there is a TV series about an asteroid coming towards Earth and what we are going to do about it. It is called Salvation and it fails in every single respect.

The first alarm bell was Jennifer Finnigan, the female costar from Tyrant. She was terribly annoying in that show where she posed as the voice of reason and common sense, while being a nagging and demanding wife to the ruler of a foreign country. I thought "shame on you, Siderite! Just because she was like that in that show it's no reason to hold it against the actress". In Salvation, she plays the annoying nagging and demanding voice of reason and common sense as girlfriend to the secretary of the DOD.

But that's the least of the problems of the show. The idea is that a brilliant MIT student figures out there is an asteroid coming towards Earth. He tells his professor, who then calls someone and then promptly disappears, with goons watching his house. Desperate, he finds a way to reach to an Elon Musk wannabe and tell him the story. Backed by this powerful billionaire, he then contacts the government, which, surprise!, knew all about it and already had a plan. Which fails. Time to bring in the brilliant solution of the people who care: the EM drive! For which there is a need of exactly two billion dollars and one hundred kilograms of refined uranium. And that's just episode 2.

The only moment we actually see the asteroid is in a 3D holographic video projection, coming from most likely a text data file output of a tool an MIT student would build. Somehow that turns into a 3D rendering on the laptop of the billionaire. Not only does it crash into Earth, but it shows the devastation on the planet as a fire front. Really?

Bottom line: imagine something like Madam Secretary which somehow mated with the pilot episode of the X-Files reboot. Only low budget and boring as hell. There is no science, no real plot, no sympathetic characters, nothing but artificial drama which one would imagine to be pointless in a show about the end of the world, and ridiculously beautiful people acting with the skill of underwear models (Mark Wahlberg excluded, of course). Avoid it at all costs.

Update: oh, in episode 3, the last one I will watch, they send a probe to impact the asteroid and they do it like: "OK, we have a go from the president!" And in the next minute they watch (in real time from Io and from a front camera on the probe) how it is heading towards the asteroid. I mean... why write a story and not make it use anything real? What's the point in that? Even superhero movies are more realistic than this disaster. I know the creators of the show did other masterpieces such as Extant and Scream: The TV Series and Hawaii Five-O, they don't know any better, but at least they could have tried to improve just a tiny sliver. Instead they shat on our TV screens.

Fuck Java! Just fuck it! I have been trying for half an hour to understand why a NullPointerException is returned in a Java code that I can't debug. It was a simple String object that was null inside a switch statement. According to this link states that The prohibition against using null as a switch label prevents one from writing code that can never be executed. If the switch expression is of a reference type, that is, String or a boxed primitive type or an enum type, then a run-time error will occur if the expression evaluates to null at run time.

The Mist is a novella by Stephen King that has been adapted into a great horror movie. I mean, I rated the movie 10 out of 10 stars. So when the TV show The Mist came around I was ecstatic. And then... I watched three episodes of the most insipid and obnoxious series I've seen since Under The Dome. Nay, since Fear of the Walking Dead.

Imagine they removed most of the monsters and replaced them with mostly insects, then they enhanced everything else: small town politics, family matters, teenagers, etc. OK, the original Mist was great because it showed the greatest ugliness was not the interdimensional creatures, but the pettiness of humans. However, it was the right balance between the two. Now, in a TV show that censors words like "fuck", you get to see teenage angst, drug rape, power hungry egotistic policemen, one of the most beautiful actresses from Vikings relegated to the role of an overprotective mother, husband and wife interactions - lots of those, junkies, amnesiac soldiers, priests, goth kids, nature freaks, old people... oh, the humanity! Three episodes in which nothing happened other than exposition, introduction of lots of characters no one cares for and that's about it.

I am tired. I really am tired of hearing that price is driven by offer and demand - which is quite true because that's the definition of price, it has nothing to do with actual value. Same with stories: they are all about people, because people care about people and most people are people. No need for anything too exotic when all you need to do to please most people is to show them other most people. Grand from a marketing point of view, but quite pointless overall, I would say. But who's gonna listen to me, I am not most people after all.

Bottom line: lately there has been a lot of effort invested into TV. HBO and Netflix have led the way by caring about their productions enough to make them rival and even beat not only film productions, but also the original literary material. This has led me to hope against hope that The Mist will be the best horror TV show out there, one that would maybe last two or three seasons at most, but burn a bright light. Instead it is a dying fire that wasn't properly lit and is probably going to take two or three seasons just to properly die out without anyone noticing it is gone, yet managing to poison the legacy of the film forever.

As you probably know, whenever I blog something, an automated process sends a post to Facebook and one to Twitter. As a result, some people comment on the blog, some on Facebook or Twitter, but more often someone "likes" my blog post. Don't get me wrong, I appreciate the sentiment, but it is quite meaningless. Why did you like it? Was it well written, well researched, did you find it useful and if so in what way? I would wager that most of the time the feeling is not really that clear cut, either. Maybe you liked most of the article, but then you absolutely hated a paragraph. What should you do then? Like it a bunch of times and hate it once?

This idea that people should express emotion related to someone else's content is not only really really stupid, it is damaging. Why? I am glad you asked - clearly you already understand the gist of my article and have decided to express your desire for knowledge over some inevitable sense of awe and gratitude. Because if it is natural for people to express their emotions related to your work, then that means you have to accept some responsibility for what they get to feel, and then you fall into the political correctness, safe zone, don't do anything for someone might get hurt pile of shit. Instead, accept the fact that sharing knowledge or even expressing an opinion is nothing more than a data signal that people may or may not use. Don't even get me started on that "why didn't you like my post? was it something wrong with it? Are you angry with me?" insecurity bullshit that may be cute coming from a 12 year old, but it's really creepy with 50 year old people.

Back to my amazing blog posts, I am really glad you like them. You make my day. I am glowing and I am filled with a sense of happiness that is almost impossible to describe. And then I start to think, and it all goes away. Why did you like it, I wonder? Is it because you feel obligated to like stuff when your friends post? Is it some kind of mercy like? Or did you really enjoy part of the post? Which one was it? Maybe I should reread it and see if I missed something. Mystery like! Nay, more! It is a riddle, wrapped in a mystery, inside an enigma; but perhaps there is a key. That key is personal interest in providing me with useful feedback, the only way you can actually help me improve content.

Let me reiterate this as clear as I possibly can: the worse thing you can do is try to spare my feelings. First of all, it is hubris to believe you have any influence on them at all. Second, you are not skilled enough to understand in what direction your actions would influence them anyway. And third, feeling is the stuff that fixates memories, but you have to have some memory to fixate first! Don't sell a lifetime of knowing something on a few seconds of feeling gratified by some little smiley or bloody heart.

And then there is another reason, maybe one that is more important than everything I have written here. When you make the effort of summarizing what you have read in order to express an opinion you retrieve and generate knowledge in your own head, meaning you will remember it better and it will be more useful to you.

So fuck your wonderful emotions! Give me your thoughts and knowledge instead.

Today I received two DMCA notices. One of them might have been true, but the second was for a file which started with
/*
Copyright (c) 2010, Yahoo! Inc. All rights reserved.
Code licensed under the BSD License:
http://developer.yahoo.com/yui/license.html
version: 2.8.1
*/
Nice, huh?

The funny part is that these are files on my Google Drive, which are not used anywhere anymore and are accessible only by people with a direct link to them. Well, I removed the sharing on them, just in case. The DMCA is even more horrid than I thought. The links in it are general links towards a search engine for notices (not the link to the actual notice) and some legalese documents, the email it is coming from is noreply-6b094097@google.com and any hope that I might fight this is quashed with clear intention from the way the document is worded.

So remember: Google Drive is not yours, it's Google's. I wonder if I would have gotten the DMCA even if the file was not being shared. There is a high chance I would, since no one should be using the link directly.

Bleah, lawyers!

I have enabled Disqus comments on this blog and it is supposed to work like this: every old comment from Blogger has to be imported into Disqus and every new comment from Disqus needs to be also saved in the Blogger system. Importing works just fine, but "syncing" does not. Every time someone posts a comment I receive this email:
Hi siderite,
 
You are receiving this email because you've chosen to sync your
comments on Disqus with your Blogger blog. Unfortunately, we were not
able to access this blog.
 
This may happen if you've revoked access to Disqus. To re-enable,
please visit:
https://siderite.disqus.com/admin/discussions/import/platform/blogger/
 
Thanks,
The Disqus Team
Of course, I have not revoked any access, but I "reenable" just the same only to be presented with a link to resync that doesn't work. I mean, it is so crappy that it returns the javascript error "e._ajax is undefined" for a line where e._ajax is used instead of e.ajax and even if that would have worked, it uses a config object that is not defined.

It doesn't really matter, because the ajax call just accesses (well, it should access) https://siderite.disqus.com/admin/discussions/import/platform/blogger/resync/. And guess what happens when I go there: I receive an email that the Disqus access in Blogger has been revoked.

No reply for the Disqus team for months, for me or anybody else having this problem. They have a silly page that explains that, of course, they are not at fault, Blogger did some refactoring and broke their system. Yeah, I believe that. They probably renamed the ajax function in jQuery as well. Damn Google!

I am going to discuss in this post an interview question that pops up from time to time. The solution that is usually presented as best is the same, regardless of the inputs. I believe this to be a mistake. Let me explore this with you.

The problem



The problem is simple: given two sorted arrays of very large size, find the most efficient way to compute their intersection (the list of common items in both).

The solution that is given as correct is described here (you will have to excuse its Javiness), for example. The person who provided the answer made a great effort to list various solutions and list their O complexity and the answer inspires confidence, as coming from one who knows what they are talking about. But how correct is it? Another blog post describing the problem and hinting on some extra information that might influence the result is here.

Implementation


Let's start with some code:
var rnd = new Random();
var n = 100000000;
int[] arr1, arr2;
generateArrays(rnd, n, out arr1, out arr2);
var sw = new Stopwatch();
sw.Start();
var count = intersect(arr1, arr2).Count();
sw.Stop();
Console.WriteLine($"{count} intersections in {sw.ElapsedMilliseconds}ms");
Here I am creating two arrays of size n, using a generateArrays method, then I am counting the number of intersections and displaying the time elapsed. In the intersect method I will also count the number of comparisons, so that we avoid for now the complexities of Big O notation (pardon the pun).

As for the generateArrays method, I will use a simple incremented value to make sure the values are sorted, but also randomly generated:
private static void generateArrays(Random rnd, int n, out int[] arr1, out int[] arr2)
{
arr1 = new int[n];
arr2 = new int[n];
int s1 = 0;
int s2 = 0;
for (var i = 0; i < n; i++)
{
s1 += rnd.Next(1, 100);
arr1[i] = s1;
s2 += rnd.Next(1, 100);
arr2[i] = s2;
}
}

Note that n is 1e+7, so that the values fit into an integer. If you try a larger value it will overflow and result in negative values, so the array would not be sorted.

Time to explore ways of intersecting the arrays. Let's start with the recommended implementation:
private static IEnumerable<int> intersect(int[] arr1, int[] arr2)
{
var p1 = 0;
var p2 = 0;
var comparisons = 0;
while (p1<arr1.Length && p2<arr2.Length)
{
var v1 = arr1[p1];
var v2 = arr2[p2];
comparisons++;
switch(v1.CompareTo(v2))
{
case -1:
p1++;
break;
case 0:
p1++;
p2++;
yield return v1;
break;
case 1:
p2++;
break;
}

}
Console.WriteLine($"{comparisons} comparisons");
}

Note that I am not counting the comparisons of the two pointers p1 and p2 with the Length of the arrays, which can be optimized by caching the length. They are just as resource using as comparing the array values, yet we discount them in the name of calculating a fictitious growth rate complexity. I am going to do that in the future as well. The optimization of the code itself is not part of the post.

Running the code I get the following output:
19797934 comparisons
199292 intersections in 832ms

The number of comparisons is directly proportional with the value of n, approximately 2n. That is because we look for all the values in both arrays. If we populate the values with odd and even numbers, for example, so no intersections, the number of comparisons will be exactly 2n.

Experiments


Now let me change the intersect method, make it more general:
private static IEnumerable<int> intersect(int[] arr1, int[] arr2)
{
var p1 = 0;
var p2 = 0;
var comparisons = 0;
while (p1 < arr1.Length && p2 < arr2.Length)
{
var v1 = arr1[p1];
var v2 = arr2[p2];
comparisons++;
switch (v1.CompareTo(v2))
{
case -1:
p1 = findIndex(arr1, v2, p1, ref comparisons);
break;
case 0:
p1++;
p2++;
yield return v1;
break;
case 1:
p2 = findIndex(arr2, v1, p2, ref comparisons);
break;
}

}
Console.WriteLine($"{comparisons} comparisons");
}

private static int findIndex(int[] arr, int v, int p, ref int comparisons)
{
p++;
while (p < arr.Length)
{
comparisons++;
if (arr[p] >= v) break;
p++;
}
return p;
}
Here I've replaced the increment of the pointers with a findIndex method that keeps incrementing the value of the pointer until the end of the array is reached or a value larger or equal with the one we are searching for was found. The functionality of the method remains the same, since the same effect would have been achieved by the main loop. But now we are free to try to tweak the findIndex method to obtain better results. But before we do that, I am going to P-hack the shit out of this science and generate the arrays differently.

Here is a method of generating two arrays that are different because all of the elements of the first are smaller than the those of the second. At the very end we put a single element that is equal, for the fun of it.
private static void generateArrays(Random rnd, int n, out int[] arr1, out int[] arr2)
{
arr1 = new int[n];
arr2 = new int[n];
for (var i = 0; i < n - 1; i++)
{
arr1[i] = i;
arr2[i] = i + n;
}
arr1[n - 1] = n * 3;
arr2[n - 1] = n * 3;
}

This is the worst case scenario for the algorithm and the value of comparisons is promptly 2n. But what if we would use binary search (what in the StackOverflow answer was dismissed as having O(n*log n) complexity instead of O(n)?) Well, then... the output becomes
49 comparisons
1 intersections in 67ms
Here is the code for the findIndex method that would do that:
private static int findIndex(int[] arr, int v, int p, ref int comparisons)
{
var start = p + 1;
var end = arr.Length - 1;
if (start > end) return start;
while (true)
{
var mid = (start + end) / 2;
var val = arr[mid];
if (mid == start)
{
comparisons++;
return val < v ? mid + 1 : mid;
}
comparisons++;
switch (val.CompareTo(v))
{
case -1:
start = mid + 1;
break;
case 0:
return mid;
case 1:
end = mid - 1;
break;
}
}
}

49 comparisons is smack on the value of 2*log2(n). Yeah, sure, the data we used was doctored, so let's return to the randomly generated one. In that case, the number of comparisons grows horribly:
304091112 comparisons
199712 intersections in 5095ms
which is larger than n*log2(n).

Why does that happen? Because in the randomly generated data the binary search find its worst case scenario: trying to find the first value. It divides the problem efficiently, but it still has to go through all the data to reach the first element. Surely we can't use this for a general scenario, even if it is fantastic for one specific case. And here is my qualm with the O notation: without specifying the type of input, the solution is just probabilistically the best. Is it?

Let's compare the results so far. We have three ways of generating data: randomly with increments from 1 to 100, odds and evens, small and large values. Then we have two ways of computing the next index to compare: linear and binary search. The approximate numbers of comparisons are as follows:
RandomOddsEvensSmallLarge
Linear2n2n2n
Binary search3/2*n*log(n)2*n*log(n)2*log(n)

Alternatives


Can we create a hybrid findIndex that would have the best of both worlds? I will certainly try. Here is one possible solution:
private static int findIndex(int[] arr, int v, int p, ref int comparisons)
{
var inc = 1;
while (true)
{
if (p + inc >= arr.Length) inc = 1;
if (p + inc >= arr.Length) return arr.Length;
comparisons++;
switch(arr[p+inc].CompareTo(v))
{
case -1:
p += inc;
inc *= 2;
break;
case 0:
return p + inc;
case 1:
if (inc == 1) return p + inc;
inc /= 2;
break;
}
}
}

What am I doing here? If I find the value, I return the index; if the value is smaller, not only do I advance the index, but I also increase the speed of the next advance; if the value is larger, then I slow down until I get to 1 again. Warning: I do not claim that this is the optimal algorithm, this is just something that was annoying me and I had to explore it.

OK. Let's see some results. I will decrease the value of n even more, to a million. Then I will generate the values with random increases of up to 10, 100 and 1000. Let's see all of it in action! This time is the actual count of comparisons (in millions):
Random10Random100Random1000OddsEvensSmallLarge
Linear22222
Binary search303030400.00004
Accelerated search3.43.93.940.0002

So for the general cases, the increase in comparisons is at most twice, while for specific cases the decrease can be four orders of magnitude!

Conclusions


Because I had all of this in my head, I made a fool of myself at a job interview. I couldn't reason all of the things I wrote here in a few minutes and so I had to clear my head by composing this long monstrosity.

Is the best solution the one in O(n)? Most of the time. The algorithm is simple, no hidden comparisons, one can understand why it would be universally touted as a good solution. But it's not the best in every case. I have demonstrated here that I can minimize the extra comparisons in standard scenarios and get immense improvements for specific inputs, like arrays that have chunks of elements smaller than the next value in the other array. I would also risk saying that this findIndex version is adaptive to the conditions at hand with improbable scenarios as worst cases. It works reasonable well for normally distributed arrays, it does wonders for "chunky" arrays (in this is included the case when one array is much smaller than the other) and thus is a contender for some kinds of uses.

What I wanted to explore and now express is that finding the upper growth rate of an algorithm is just part of the story. Sometimes the best implementation fails for not adapting to the real input data. I will say this, though, for the default algorithm: it works with IEnumerables, since it never needs to jump forward over some elements. This intuitively gives me reason to believe that it could be optimized using the array/list indexing. Here it is, in IEnumerable fashion:
private static IEnumerable<int> intersect(IEnumerable<int> arr1, IEnumerable<int> arr2)
{
var e1 = arr1.GetEnumerator();
var e2 = arr2.GetEnumerator();
var loop = e1.MoveNext() && e2.MoveNext();
while (loop)
{
var v1 = e1.Current;
var v2 = e2.Current;
switch (v1.CompareTo(v2))
{
case -1:
loop = e1.MoveNext();
break;
case 0:
loop = e1.MoveNext() && e2.MoveNext();
yield return v1;
break;
case 1:
loop = e2.MoveNext();
break;
}

}
}

Extra work


The source code for a project that tests my various ideas can be found on GitHub. There you can find the following algorithms:
  • Standard - the O(m+n) one described above
  • Reverse - same, but starting from the end of the arrays
  • Binary Search - looks for values in the other array using binary search. Complexity O(m*log(n))
  • Smart Choice - when m*log(n)<m+n, it uses the binary search, otherwise the standard one
  • Accelerating - the one that speeds up when looking for values
  • Divide et Impera - recursive algorithm that splits arrays by choosing the middle value of one and binary searching it in the other. Due to the complexity of the recursiveness, it can't be taken seriously, but sometimes gives surprising results
  • Middle out - it takes the middle value of one array and binary searches it in the other, then uses Standard and Reverse on the resulting arrays
  • Pair search - I had high hopes for this, as it looks two positions in front instead of one. Really good for some cases, though generally it is a bit more than Standard

The testing tool takes all algorithms and runs them on randomly generated arrays:
  1. Lengths m and n are chosen randomly from 1 to 1e+6
  2. A random number s of up to 100 "spikes" is chosen
  3. m and n are split into s+1 equal parts
  4. For each spike a random integer range is chosen and filled with random integer values
  5. At the end, the rest of the list is filled with any random values

Results


For really small first array, the Binary Search is king. For equal size arrays, usually the Standard algorithm works wins. However there are plenty of cases when Divide et Impera and Pair Search win - usually not by much. Sometimes it happens that Accelerating Search is better than Standard, but Pair Search wins! I still have the nagging feeling that Pair Search can be improved. I feel it in my gut! However I have so many other things to do for me to dwell on this.

Maybe one of you can find the solution! Your mission, should you choose to accept it, is to find a better algorithm for intersecting sorted arrays than the boring standard one.