Run Whisper-large-v3 on Google Colab (T4)

0𝕏koji

9 min read · Nov 19, 2023

whisper

GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision

github.com

As the title says, everything below was run on Google Colab with the free T4 GPU.
My Jupyter notebook 👇

llm_on_GoogleColab/Whisper_large_v3.ipynb at main · koji/llm_on_GoogleColab

github.com
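Since everything below assumes a GPU runtime, it can help to confirm that one is actually attached before installing anything. The following is a minimal sketch, not part of the original notebook: the helper name `detect_gpu_name` is hypothetical, and it assumes the `nvidia-smi` tool is on the PATH whenever a GPU runtime is active.

```python
import shutil
import subprocess
from typing import Optional


def detect_gpu_name() -> Optional[str]:
    """Return the name of the first NVIDIA GPU, or None if none is visible."""
    # nvidia-smi ships with the NVIDIA driver; if it is missing, assume no GPU.
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        # The tool exists but failed or timed out; treat it as "no GPU".
        return None
    names = [line.strip() for line in out.splitlines() if line.strip()]
    return names[0] if names else None


gpu = detect_gpu_name()
print(gpu if gpu else "No GPU detected: switch the Colab runtime type to a T4 GPU")
```

If this reports no GPU, the model will most likely still load on the CPU, but transcription will be much slower.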

1 install openai-whisper

!pip install -U openai-whisper

2 install yt-dlp
If you already have an mp3 that you want to try with Whisper, you can skip this step.

I’m using a video about OpenAI’s breaking news.
The following command downloads only the audio from the video.

!pip install yt-dlp
!yt-dlp -x --audio-format mp3 "https://www.youtube.com/watch?v=Rbl7qmTH6b8" -o test.mp3

3 run whisper

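import whisper

# "large-v3" is the checkpoint used in this post; it is downloaded on first use.
model_size = "large-v3"

model = whisper.load_model(model_size)
# transcribe() returns a dict that includes the full "text" and a list of "segments".
result = model.transcribe("test.mp3")
print(result["text"])

segments = result["segments"]

# Print each segment with its start and end time in seconds.
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment['start'], segment['end'], segment['text']))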

The result is shown below. It’s pretty accurate. Transcribing an audio file of around 9 minutes took less than 5 minutes.

[0.00s -> 5.02s]  We have a lot more information about what went down yesterday with Sam Altman getting fired from OpenAI.
[5.32s -> 9.12s]  So let me give you a bunch of updates, but of course, we don't know everything yet.
[9.12s -> 14.40s]  So shortly after the board announced that Sam Altman was getting fired, Sam Altman posted on X,
[14.54s -> 19.40s]  I loved my time at OpenAI. It was transformative for me personally and hopefully the world a little bit.
[19.54s -> 22.66s]  Absolutely. Most of all, I loved working with such talented people.
[22.78s -> 25.22s]  We'll have more to say about what's next later.
[25.22s -> 31.66s]  Sam Altman is one of the best operators and founders and technologists in general of our generation.
[31.94s -> 38.32s]  And for him to unceremoniously get fired from the company that he's put his blood, sweat, and tears into over the years,
[38.42s -> 41.12s]  it hurts my heart to hear as a former founder myself.
[41.28s -> 44.84s]  And not only that, he was seemingly betrayed by two of his close colleagues.
[45.04s -> 48.10s]  And ultimately, the board is the one that fired him.
[48.24s -> 53.48s]  And funny enough, the board probably spends only one to two hours a month working on OpenAI.
[53.48s -> 54.98s]  And Sam Altman probably spends...
[55.22s -> 57.04s]  16 hours a day on it.
[57.12s -> 59.42s]  So it's really heartbreaking to see this.
[59.58s -> 64.68s]  And I was following Kara Swisher all night because she was reporting the news in near real time.
[64.90s -> 67.80s]  And it seems like a lot of what she was saying was really accurate.
[68.02s -> 74.94s]  And what seems to happen is Ilya Sutskovor and Mira Murady had a split with Sam Altman and Greg Brockman.
[75.08s -> 78.02s]  But why? Why did they do this? That's the important question.
[78.02s -> 85.18s]  From all the reporting I read, it seems to relate to the fact that Sam Altman and Greg Brockman wanted to move really,
[85.32s -> 88.54s]  fast. They wanted to release technology as quickly as they could.
[88.64s -> 90.74s]  And they wanted to make a lot of money doing it.
[90.82s -> 95.14s]  But OpenAI has this weird structure where they're a non-profit company.
[95.30s -> 100.46s]  But then they created this separate entity that could make profit, but it was still owned by the non-profit.
[100.70s -> 104.74s]  And nobody on the board really had a financial incentive aligned with the company.
[104.86s -> 106.02s]  They don't have shares.
[106.26s -> 108.28s]  And neither did Sam Altman, frankly.
[108.28s -> 113.12s]  So it's really just an odd structure for one of the most influential companies of our generation.
[113.60s -> 115.18s]  And Ilya, Mira, and the board...
[115.34s -> 117.34s]  seem to just get really nervous.
[117.34s -> 121.54s]  And Sam Altman seemed to be really headstrong and thought he could just make all the decisions.
[121.54s -> 123.90s]  But ultimately, the board makes the decisions.
[123.90s -> 125.90s]  The board can hire and fire CEOs.
[125.90s -> 126.94s]  That is what they do.
[126.94s -> 128.82s]  So let's take a look at what Kara Swisher said.
[128.82s -> 131.78s]  So it looks like Ilya Sutskovor said a few things about the situation.
[131.78s -> 132.44s]  Let's read it.
[132.44s -> 135.38s]  You can call it this way, Sutskovor said about the coup allegations.
[135.38s -> 137.48s]  Of course, a lot of people are calling it a coup.
[137.48s -> 139.68s]  And I can understand why you would choose this word.
[139.68s -> 141.06s]  But I disagree with this.
[141.06s -> 145.02s]  This was the board doing its duty to the mission of the non-profit.
[145.02s -> 145.18s]  Which is the board doing its duty to the mission of the non-profit.
[145.18s -> 145.20s]  Which is the board doing its duty to the mission of the non-profit.
[145.22s -> 148.82s]  Which is to make sure that OpenAI builds AGI that benefits all of humanity.
[148.82s -> 154.02s]  When Sutskovor was asked whether these backroom removals are a good way to govern the most important company in the world,
[154.02s -> 154.62s]  he answered,
[154.62s -> 155.74s]  I mean, fair.
[155.74s -> 158.50s]  I agree that there is not an ideal element to it.
[158.50s -> 159.42s]  100%.
[159.42s -> 160.98s]  And here's the crazy part.
[160.98s -> 162.78s]  Basically, nobody knew.
[162.78s -> 166.70s]  Kara Swisher reported that Sam Altman found out 30 minutes in advance.
[166.70s -> 169.34s]  Greg Brockman, 5 minutes in advance.
[169.34s -> 172.18s]  And Greg Brockman was the chair of the board.
[172.18s -> 175.02s]  So they really made a move on him.
[175.02s -> 179.62s]  And Microsoft, who owns about 50%, was told just before it all went down.
[179.62s -> 183.62s]  So it's really just a handful of people who made this incredible move.
[183.62s -> 188.14s]  And yeah, she used a clown in a car because yes, this is a clown car right now.
[188.14s -> 191.92s]  And Kara also reported at 8.32 PM last night,
[191.92s -> 198.02s]  more of the board members who voted against Altman felt he was manipulative and headstrong and wanted to do what he wanted to do.
[198.02s -> 204.22s]  That sounds like the typical SV Silicon Valley CEO to me, but this might not be the typical SV company.
[204.22s -> 204.82s]  And yeah,
[204.82s -> 213.42s]  these are extremely aggressive, extremely bright people who are advancing the most important technology of our time.
[213.42s -> 217.82s]  But the structure of open AI is non-typical for Silicon Valley.
[217.82s -> 219.12s]  And then she follows up with,
[219.12s -> 227.72s]  would be eager to hear the actual specifics of their concerns and also evidence that they tried to inform him if they had problems and gave him a chance to respond and change.
[227.72s -> 229.32s]  If not, looks clottish.
[229.32s -> 231.72s]  So that is a really good point.
[231.72s -> 234.02s]  Specifically in the blog post by the board,
[234.02s -> 237.32s]  they said that Sam Altman was not consistently candid with the board.
[237.32s -> 239.22s]  That means he was lying to the board.
[239.22s -> 242.12s]  And so they're probably going to have to put out evidence soon.
[242.12s -> 244.42s]  Otherwise, they're going to look really foolish.
[244.42s -> 249.42s]  Then at 8.42 PM last night, Greg Brockman put out his statement.
[249.42s -> 252.52s]  Sam and I are shocked and saddened by what the board did today.
[252.52s -> 255.32s]  So Sam and Greg are completely aligned with this.
[255.32s -> 258.92s]  And I wouldn't be surprised if they announced a new company as soon as Monday.
[258.92s -> 262.82s]  Let us first say thank you to all the incredible people who we have worked with at open AI.
[262.82s -> 264.02s]  Our customers, our investors,
[264.02s -> 265.82s]  and all of those who have been reaching out.
[265.82s -> 268.22s]  We too are still trying to figure out exactly what happened.
[268.22s -> 269.22s]  Here is what we know.
[269.22s -> 274.02s]  It's really shocking how this went down and almost nobody knew what was going on.
[274.02s -> 278.82s]  History might see this as one of the worst board decisions of all time.
[278.82s -> 282.42s]  Last night, Sam got a text from Ilya asking to talk at noon Friday.
[282.42s -> 284.82s]  Sam joined the Google Meet and the whole board,
[284.82s -> 286.82s]  except Greg, was there.
[286.82s -> 290.22s]  Now, it's interesting that he says except Greg when he's talking about himself,
[290.22s -> 291.12s]  but fine.
[291.12s -> 293.42s]  Ilya told Sam he was being fired,
[293.42s -> 295.22s]  and that the news was going out very soon.
[295.22s -> 299.02s]  At 1219, Greg got a text from Ilya asking for a quick call.
[299.02s -> 302.22s]  At 1223, Ilya sent a Google Meet link.
[302.22s -> 304.12s]  Greg was told that he's being removed from the board,
[304.12s -> 306.92s]  but was vital to the company and would retain his role,
[306.92s -> 308.42s]  and that Sam had been fired.
[308.42s -> 310.72s]  Around the same time, open AI published a blog post.
[310.72s -> 314.42s]  The fact that the board thought that they could fire Sam Altman,
[314.42s -> 315.92s]  demote Greg Brockman,
[315.92s -> 319.42s]  but still keep him at the company seems absurd to me.
[319.42s -> 321.42s]  I have no idea what they were thinking.
[321.42s -> 322.12s]  As far as we know,
[322.12s -> 323.22s]  the management team was made,
[323.22s -> 324.92s]  and beware of this shortly after,
[324.92s -> 327.52s]  other than Mira, who found out the night prior.
[327.52s -> 328.62s]  So it's interesting.
[328.62s -> 332.02s]  She knew, but she only found out the night prior.
[332.02s -> 335.42s]  So it looks like she wasn't involved in the actual decision,
[335.42s -> 338.62s]  but she definitely sided with the board's decision on this.
[338.62s -> 340.42s]  The outpouring of support has been really nice.
[340.42s -> 342.42s]  Thank you, but please don't spend any time being concerned.
[342.42s -> 343.12s]  We will be fine.
[343.12s -> 344.22s]  Greater things coming soon.
[344.22s -> 346.92s]  Of course, these are two of the greatest founders of all time,
[346.92s -> 348.72s]  and they're going to be totally fine.
[348.72s -> 353.02s]  And the amount of posts from founders that I've read about Sam Altman helping them
[353.02s -> 356.92s]  throughout his career and their career has been really amazing.
[356.92s -> 358.92s]  Sam Altman is well liked,
[358.92s -> 363.62s]  well respected through Silicon Valley and through the entire world of technology.
[363.62s -> 365.92s]  Sam Altman at 9 0 5 p.m.
[365.92s -> 366.92s]  Put out a post.
[366.92s -> 367.82s]  I love you all.
[367.82s -> 369.72s]  Today was a weird experience in many ways,
[369.72s -> 374.12s]  but one unexpected one is that it has been sort of like reading your own eulogy
[374.12s -> 375.12s]  while you're still alive.
[375.12s -> 377.02s]  The outpouring of love is awesome.
[377.02s -> 380.02s]  One takeaway, go tell your friends how great you think they are.
[380.02s -> 380.72s]  I agree.
[380.72s -> 383.02s]  And then I just want to show this reply to that.
[383.02s -> 385.72s]  Tweet run it back brother for praying for exits.
[385.72s -> 393.62s]  This is a reference to a curb your enthusiasm episode where Larry David builds a coffee shop next to a coffee shop that did him wrong.
[393.62s -> 395.02s]  And it was a spite coffee shop.
[395.02s -> 396.72s]  So he built it completely out of spite.
[396.72s -> 402.82s]  And this is just such an incredible reference and Sam Altman needs to go build an AI company purely out of spite.
[402.82s -> 406.52s]  And then here's the weird one 9 32 p.m.
[406.52s -> 411.82s]  If I start going off the open AI board should go after me for the full value of my shares.
[411.82s -> 412.02s]  Now,
[412.02s -> 412.62s]  this is a
[413.02s -> 416.82s]  complete troll because Sam Altman has already mentioned in the past.
[416.82s -> 419.82s]  He basically has no financial incentive in open AI.
[419.82s -> 421.42s]  He has no shares or the shares.
[421.42s -> 424.12s]  He does have are worth very little to nothing.
[424.12s -> 425.22s]  So he's basically like,
[425.22s -> 425.82s]  all right,
[425.82s -> 426.72s]  I'm out now.
[426.72s -> 430.42s]  I can say whatever I want because you literally can't take any of my shares back.
[430.42s -> 431.52s]  They're not worth anything.
[431.52s -> 436.12s]  And then late last night Killian has reported at 1245 a.m.
[436.12s -> 441.42s]  More senior departures from open AI tonight GPT for lead director of research.
[441.42s -> 442.82s]  Jacob Hachaki,
[442.82s -> 447.32s]  head of AI risk Alexander Madri open-source baselines researcher,
[447.32s -> 451.62s]  Sisman Sayed or and Sam Altman Greg Brockman and that's just day one.
[451.62s -> 452.02s]  And yeah,
[452.02s -> 460.72s]  if Sam Altman is truly liked and respected within open AI as a lot of people are reporting and that is clear in the broader industry.
[460.72s -> 462.52s]  A lot of people are going to follow him.
[462.52s -> 465.82s]  So still a lot going on still a lot of information coming out,
[465.82s -> 468.02s]  but that's what we know definitively for now.
[468.02s -> 468.92s]  If you liked this video,
[468.92s -> 472.02s]  please consider giving a like and subscribe and I'll see you in the next one.

Written by 0𝕏koji