WEBVTT

1
00:00:03.000 --> 00:00:09.420
Josh Moore: I think we're good. Okay, I'll share my screen for just a second. I want to make sure everyone knows how we are using the HackMD document.

2
00:00:10.469 --> 00:00:11.340
Josh Moore: Um,

3
00:00:16.619 --> 00:00:31.860
Josh Moore: So the link's in the chat. At the bottom of the file, you can see all the notes from the previous conversation: the European and Asian conversation, plus a couple of crazy East Coasters. Um, they get lots of credit.

4
00:00:33.450 --> 00:00:46.800
Josh Moore: So we'll be filling out this session. I've left the topics that were discussed in the morning, so if anyone wants a recap, we can talk about those. Put your name beside them and then we'll start to order the topics by interest. So the more

5
00:00:51.360 --> 00:00:54.060
Josh Moore: The more names that are beside something, then

6
00:00:55.320 --> 00:01:03.720
Josh Moore: the more it'll bubble to the top. If you have questions, or you want to talk about anything you want to share, by all means, um,

7
00:01:05.340 --> 00:01:10.050
Josh Moore: Alright, so that's probably enough for me. Before we dive into things so

8
00:01:11.670 --> 00:01:25.650
Josh Moore: Who showed up? Alright, so we'll go around the room by my participants list. We'll do the non-Dundonians first, alphabetically, so you can start to prepare yourself.

9
00:01:27.870 --> 00:01:43.410
Josh Moore: Is Mark at the top of the list? I assume that'll change. So I think, Bill, you're at the top. You want to take about a minute to introduce yourself? Someone yeah said has a clock. He will let you know when you're running over, but this went pretty quickly last time, so go for

10
00:01:45.420 --> 00:02:00.840
Bill Katz: Me. Yes. Yeah. So, I'm Bill Katz. I'm with Janelia, the FlyEM team. I designed DVID and I'm working pretty extensively on data stores matches to image.

11
00:02:01.440 --> 00:02:19.680
Bill Katz: Formats or segmentation. But basically, a little bit more general in terms of data types. But the key thing I'm also very interested in is branched versioning, in creating a GitHub-like way of starting to do data sharing among, you know, various image formats, etc.

12
00:02:25.440 --> 00:02:25.860
Josh Moore: Caterina.

13
00:02:27.180 --> 00:02:28.680
Bill Katz: That's it. Thanks.

14
00:02:31.170 --> 00:02:35.460
Caterina Strambio: Hello, Caterina Strambio-De-Castillia. I am here

15
00:02:36.510 --> 00:02:44.100
Caterina Strambio: Primarily, to make sure that we are, I'm working primarily on the metadata side, microscopy metadata,

16
00:02:45.360 --> 00:02:47.400
Caterina Strambio: To extend the OME model.

17
00:02:48.420 --> 00:03:08.340
Caterina Strambio: And I'm here primarily to make sure that what we're doing, you know, matches what you guys, I mean, what the data container will have, because obviously the two things are very connected. There is going to be a different community call about the metadata aspects.

18
00:03:10.110 --> 00:03:10.440
Caterina Strambio: Thank you.

19
00:03:12.330 --> 00:03:12.600
Josh Moore: Damir.

20
00:03:16.830 --> 00:03:16.980
Damir Sudar: I'm

21
00:03:18.360 --> 00:03:25.380
Damir Sudar: Working for a small company called Quantitative Imaging Systems and closely associated with Oregon Health and Science University in Portland.

22
00:03:26.100 --> 00:03:44.850
Damir Sudar: My interest is primarily very similar to Caterina's, because we're on the working group trying to come up with that metadata standards called set. And I'm also very interested in the data container side because a lot of our data is in

23
00:03:45.960 --> 00:03:57.210
Damir Sudar: Formats right now that are difficult to handle. And so a new way of storing pixel data is desperately needed. I'd love to be part of that.

24
00:03:58.950 --> 00:03:59.310
Thanks.

25
00:04:02.640 --> 00:04:03.030
Josh Moore: David

26
00:04:06.000 --> 00:04:14.670
David Pinto: Hi, I'm David from Micron, University of Oxford. I've been here this morning. A similar to discussions going to be a good evening and

27
00:04:15.180 --> 00:04:24.060
David Pinto: I mostly work at the moment on acquisition software for microscopes. So I'm interested in how we can, you know, use these not only to save data

28
00:04:24.420 --> 00:04:35.430
David Pinto: In a, you know, recognized format, but as well the metadata, since our focus is, you know, quite exotic microscopes and devices. So I'd like to see how we can make use of these tools.

29
00:04:40.290 --> 00:04:40.740
Josh Moore: Davis.

30
00:04:41.370 --> 00:04:46.650
Davis Bennett: Hey, Davis Bennett. I'm at Janelia Research Campus and I work with a project team that is

31
00:04:48.720 --> 00:04:57.300
Davis Bennett: Acquiring really big FIB-SEM datasets and generating tons of extra derived volumes from those, and then moving them to S3 and making them all shareable.

32
00:04:58.350 --> 00:05:01.770
Davis Bennett: So I have strong opinions about multi scale metadata.

33
00:05:02.910 --> 00:05:12.270
Davis Bennett: And, you know, all kinds of image container issues, kind of light sheet microscopy. So my data has no time axis now, but I'm axis.

34
00:05:16.950 --> 00:05:17.310
Josh Moore: Eric.

35
00:05:18.660 --> 00:05:31.170
Eric Perlman: Eric Perlman. I worked previously in the Saalfeld lab with TEM and various large volumes of data. So I care about multiscale for people to access it, and most recently have been working with

36
00:05:32.730 --> 00:05:35.820
Eric Perlman: Jackson Lab on us arranging slides.

37
00:05:37.830 --> 00:05:42.870
Eric Perlman: And yes, I am here for the second session, but I figured I'd mostly sit silent. Unfortunately, I couldn't hide from Josh.

38
00:05:47.730 --> 00:05:49.410
Josh Moore: More Eric's yeah

39
00:05:49.560 --> 00:05:54.870
Eric Wait: My name is Eric Wait. I'm at the Advanced Imaging Center at Janelia and my interest is getting

40
00:05:56.160 --> 00:06:09.390
Eric Wait: Our visitors come in and use our scopes and it's kind of a one-off for them. So how can we make data that they can use back at their institute, and maybe make something more cohesive for people to use

41
00:06:10.440 --> 00:06:14.520
Eric Wait: With us, with the advanced scopes, and maybe with their scopes back in their home institute.

42
00:06:17.700 --> 00:06:18.360
Josh Moore: Eric

43
00:06:20.010 --> 00:06:20.820
Erick Ratamero: That's me, I guess.

44
00:06:22.530 --> 00:06:32.250
Erick Ratamero: Yeah, my name is Erick Ratamero. I work at the Jackson Laboratory, together with the other Eric that just talked, like, two people ago.

45
00:06:32.700 --> 00:06:43.260
Erick Ratamero: I've been doing this bioimage analysis thing for the last three years or so, first in a facility context, now in an IT kind of context.

46
00:06:43.770 --> 00:06:49.950
Erick Ratamero: And JAX seems to be like a place with a very specific intersection of people with the capability,

47
00:06:50.700 --> 00:07:08.250
Erick Ratamero: The expertise and the interest and the bandwidth to actually work on questions like, what do we do in terms of file formats for the next generation. So we are interested in trying to take part in this journey to what comes next.

48
00:07:11.160 --> 00:07:19.140
Erin Diel: Hi everyone, I'm Erin Diel. I'm an application specialist with Glencoe Software. So just really interested to hear how everyone's been using the tools and learn about

49
00:07:19.380 --> 00:07:20.250
Erin Diel: What you want from them.

50
00:07:22.950 --> 00:07:26.400
Josh Moore: Never ready for how fast Erin is. So, Ilan.

51
00:07:28.650 --> 00:07:39.570
Ilan Gold: Hi, my name is Ilan I work with Trevor, who I think will do some stuff later based on the order we're going in the work on this bizarre and basically bringing

52
00:07:40.710 --> 00:07:42.180
Ilan Gold: Your data to the web.

53
00:07:43.890 --> 00:07:58.890
Ilan Gold: And sort of its primary forms will use web beyond sort of similar theme to napari to give me like fast editing. So I guess my interest right now is mainly in making sure that, you know, things are ready for viewing in the browser.

54
00:08:02.640 --> 00:08:03.090
Jamie

55
00:08:04.980 --> 00:08:12.420
Jamie Sherman: I work at the Allen Institute for Cell Science in kind of an engineer-scientist role, and basically my focus is on

56
00:08:13.230 --> 00:08:23.610
Jamie Sherman: Trying to eliminate barriers for us being able to share our data in more open formats, and just getting it out of the proprietary formats that we're dealing with. And

57
00:08:24.390 --> 00:08:41.280
Jamie Sherman: Ideally, like, we've been using Quilt predominantly for sharing our data, but my major interest in the next-gen formats is that it might actually allow us to put our data in the cloud and not have people have to download things, because that just really doesn't work so well.

58
00:08:43.200 --> 00:08:44.130
Jamie Sherman: Anyway, thanks.

59
00:08:48.810 --> 00:08:49.260
John

60
00:08:51.750 --> 00:09:04.170
John Bogovic: My name is John Bogovic. I'm part of the Saalfeld lab at Janelia. I'm a contributor to ImageJ and Fiji and I've been one of the bigger contributors to N5, which is one of the newer block-based file formats.

61
00:09:05.970 --> 00:09:18.180
John Bogovic: In this context, I'm mostly interested in having as many tools as possible be able to open the file format or file formats that we come upon,

62
00:09:19.170 --> 00:09:26.670
John Bogovic: Since it's frustrating to see people on the user end having to consistently and

63
00:09:27.180 --> 00:09:36.480
John Bogovic: Almost always re-save data from one format to another in order to push it into different tools. So while I don't expect there will be one file format to rule them all,

64
00:09:36.990 --> 00:09:49.560
John Bogovic: Whatever we come up with, I would hope it would be easy for developers to write inputs and outputs, readers and writers, for it. That's one of my vested interests. So that's all.

65
00:09:49.620 --> 00:09:49.890
Thanks.

66
00:09:51.570 --> 00:09:51.870
Mark

67
00:09:55.470 --> 00:10:01.530
Mark Kittisopikul: Me, I'm Mark Kittisopikul and currently as associate that a system engineer.

68
00:10:02.580 --> 00:10:10.230
Mark Kittisopikul: I'm mainly attached to the Keller lab at the moment. Um, so I kind of inherited the legacy of the Keller Lab Block format.

69
00:10:12.240 --> 00:10:16.500
Mark Kittisopikul: And currently fairly interested in how to receive

70
00:10:17.940 --> 00:10:20.220
Mark Kittisopikul: Store away data for legend across the

71
00:10:24.240 --> 00:10:24.660
Next,

72
00:10:25.800 --> 00:10:37.950
Nicholas Sofroniew: Nick Sofroniew. I'm leading the imaging tech team at the Chan Zuckerberg Initiative. We're focused on trying to provide or improve access to reproducible quantitative bioimage analysis. I'm also on the steering council.

73
00:10:38.310 --> 00:10:50.280
Nicholas Sofroniew: And I'm a core contributor to the napari project, a multi-dimensional image viewer for Python. We have a plugin infrastructure to support reading and writing files and I'd want to make sure that

74
00:10:51.270 --> 00:11:06.240
Nicholas Sofroniew: For any file format, you know, we're able to easily read and write both pixels and metadata. I think, you know, going beyond that, I'm probably also interested in representations that are

75
00:11:06.960 --> 00:11:24.450
Nicholas Sofroniew: Conducive to saving processed data too, not just raw pixels. So, you know, segmentation masks, even things that end up being much smaller in size but still need to be standardized around metadata: shapes, polygons, etc. So that's just me.

76
00:11:29.370 --> 00:11:42.720
Nicolas Chiaruttini: Hello everybody. So my name is Nicolas Chiaruttini. I'm a microscopist and I also perform image analysis for users in a microscopy facility at EPFL in Switzerland.

77
00:11:43.590 --> 00:11:55.410
Nicolas Chiaruttini: And as such, we are using a lot of Fiji and ImageJ, and so I'm on the Java side. Our main issue is to be able to give users a

78
00:11:56.550 --> 00:12:09.180
Nicolas Chiaruttini: Unified way to open the data because we have microscope four monitors Nikon HR everybody all the stuff, and we hope that one day they will come up with a file format which is good,

79
00:12:10.230 --> 00:12:18.420
Nicolas Chiaruttini: Which has a lot of metadata. In particular, I'm interested in positional metadata. So when you combine different scopes,

80
00:12:18.870 --> 00:12:40.440
Nicolas Chiaruttini: A bit like in correlative EM, but also in microscopy, you know, you have very big overviews, and then you explore detail in some parts. That's why I'm here. I'm really interested in combining, also in three-dimensional space, with different resolutions. So that's my main focus. Cheers. Susanne.

81
00:12:41.250 --> 00:12:47.970
Susanne Kunis: That's me. Hello, I'm Susanne Kunis from us network from the University of Osnabrück and

82
00:12:49.230 --> 00:13:12.750
Susanne Kunis: My focus is more metadata and capturing of metadata from microscopy data and linked data, but we're also working on how we can share data and the base presentation that we in different facilities. And so I'm happy to observe this discussion on how we can make this change. Thank you.

83
00:13:16.590 --> 00:13:28.230
Trevor Manz (he/him/his): Hi everyone, I'm Trevor. As has been mentioned, I'm mostly a consumer of a lot of the hard work from the OME community on these next generation file formats, and trying to make sure that

84
00:13:29.490 --> 00:13:33.780
Trevor Manz (he/him/his): These cloud-friendly formats we can access in the web browser as well. So

85
00:13:34.650 --> 00:13:49.500
Trevor Manz (he/him/his): I've been working on this project, which is sort of the web-based viewer for these types of pixel data, and then also zarr.js, which is a JavaScript implementation of Zarr, which sort of factories intelligent things and also next generation software.

86
00:13:53.910 --> 00:13:54.270
Josh Moore: There we go.

87
00:13:56.190 --> 00:13:57.660
Ulrike Boehm: Sorry. Hello everyone.

88
00:13:58.710 --> 00:14:10.290
Ulrike Boehm: Again, I'm unfortunately right now in the lab. No, not unfortunately, actually quite good that I can be in a lab. But anyway, so my name is Ulrike Boehm. I'm working at the Advanced Imaging Center at Janelia. Currently I'm with

89
00:14:11.190 --> 00:14:17.820
Ulrike Boehm: The multi focal instrument, but anyway. So at the beginning of the year, I was actually participating

90
00:14:18.540 --> 00:14:24.870
Ulrike Boehm: In a conference called Software for Microscopy, which was mainly focused on acquisition software.

91
00:14:25.290 --> 00:14:34.380
Ulrike Boehm: But one of the major parts there was also the discussion about good data formats, and as I can still recall from the discussion,

92
00:14:34.980 --> 00:14:46.080
Ulrike Boehm: People were already kind of envisioning the next type of data format and people were arguing about which data format is a good one. But what I think currently what we might need are

93
00:14:46.530 --> 00:14:55.020
Ulrike Boehm: Really, how can I say standards, also for the people who develop upcoming data formats to make sure that they actually fit into what's currently

94
00:14:56.010 --> 00:15:12.900
Ulrike Boehm: The norm, because I think Jason can maybe say a little bit more about that, but for example Bio-Formats: I think they adjust their software currently such that lots of data formats can be used. But every time when a new data format comes to the

95
00:15:13.950 --> 00:15:31.320
Ulrike Boehm: Market, this has to be changed again. And I think this is a hassle for lots of people. I think if we have a standard in place, it might also be easier to integrate all of these and to make life for everyone kind of a bit

96
00:15:32.520 --> 00:15:39.270
Ulrike Boehm: Easier. I just briefly have to look at my notes to see if I forgot something. I am mainly because also

97
00:15:39.930 --> 00:15:40.830
Josh Moore: Just have more time.

98
00:15:41.730 --> 00:15:55.830
Ulrike Boehm: Okay. One thing was also very important. I mean, currently I'm also part of the QUAREP initiative, which is currently focused mainly on capturing standards for microscopes, mainly hardware standards, to guarantee that

99
00:15:56.880 --> 00:16:04.980
Ulrike Boehm: Everything is nicely in line. But to my mind hardware standards shouldn't be the only stuff. That's right. I think it would actually be great if even

100
00:16:05.400 --> 00:16:18.120
Ulrike Boehm: This community here, when we talk about data formats, can also be kind of linked to the other initiatives which are already going on, because I think right now there's lots of momentum in the community.

101
00:16:18.840 --> 00:16:26.220
Ulrike Boehm: And we really should try to make sure that as many voices as possible are being heard.

102
00:16:27.300 --> 00:16:28.650
Ulrike Boehm: And yeah.

103
00:16:28.710 --> 00:16:36.180
Josh Moore: I think we all agree with you. So actually, we should probably just take that recording and, you know, send that around to people. That's exactly what we all want to say. We're completely behind you.

104
00:16:36.750 --> 00:16:48.450
Josh Moore: Um, and maybe we can get a topic listed, you know, if there's more we can do about making that happen. We'll get back to it, but we had a couple of people who showed up, so getting through them: Blair showed up, and then Ola.

105
00:16:50.250 --> 00:16:58.860
Blair Rossetti (Janelia): Hello, my name is Blair Rossetti. I work with Ulrike and with Eric Wait at the AIC, and I'm happy to see many familiar faces and hear what's going on.

106
00:17:01.620 --> 00:17:04.170
Ola Tarkowska: I am Ola. I work at Sanger.

107
00:17:05.190 --> 00:17:09.810
So my interest is basically in microscopy platform and coming

108
00:17:12.900 --> 00:17:18.450
Josh Moore: Cool, so we'll rush through the Dundee team, plus Melissa. Jason, you want to kick it off?

109
00:17:19.710 --> 00:17:31.500
Jason Swedlow: Sure. Hi everybody, good to see many of you again. My name is Jason Swedlow. I'm at the University of Dundee and work with the OME team. My role there is more or less to keep the money rolling in, and

110
00:17:32.610 --> 00:17:35.640
Jason Swedlow: Try to Yeah keep everything on track.

111
00:17:39.000 --> 00:17:39.570
Josh Moore: David Gault.

112
00:17:42.150 --> 00:17:50.340
David Gault: Hi everybody. So I'm David Gault, a developer on the OME team, primarily working on Bio-Formats and all native

113
00:17:54.390 --> 00:18:01.770
Petr Walczysko: Hi, my name is Petr. I do, with the OME team, mainly the outreach and quality assurance.

114
00:18:03.930 --> 00:18:16.230
Sebastien Besson: And Sebastien. I'm working with the OME team, mostly interested in everything format related, from Bio-Formats to OME-TIFF, and, more recently, heavily involved in the Image Data Resource.

115
00:18:17.760 --> 00:18:31.200
Simon Li: I am Simon Li, one of the OMERO developers and also one of the main IDR developers and sysadmins. So I've seen all sorts of data that's coming into the IDR that would hopefully benefit from the format discussion we're having now.

116
00:18:33.660 --> 00:18:40.170
Will Moore: I am Will. I mostly work on the web viewing side of OMERO.

117
00:18:42.270 --> 00:18:49.380
Will Moore: More recently, I'm just starting to look at how to show NGFF data in the clients on the web and in napari.

118
00:18:52.620 --> 00:19:00.480
Melissa Linkert: And I'm Melissa. I work at Glencoe Software and collaborate with the OME team, and I work on pretty much anything you can think of that relates to file formats.

119
00:19:03.630 --> 00:19:18.000
Josh Moore: Cool, thank you everyone. So we're 28 people and we got done in 23 minutes. So on average, big golden stars. Um, so how do we want to proceed. So the

120
00:19:19.350 --> 00:19:23.280
Josh Moore: Working backwards, there are no new topics.

121
00:19:24.450 --> 00:19:34.110
Josh Moore: And there's a few people who have added interests. So this might be a good time for everyone to take a second and think about what they'd like to talk about, um,

122
00:19:35.250 --> 00:19:39.840
Josh Moore: I have my list of what everyone mentioned, so I'll probably bring those up, if no one else says anything

123
00:19:41.160 --> 00:19:50.100
Josh Moore: Um, and then I suggest we let everyone get a chance to ask questions about the couple of videos that were posted. So how many people actually watched the videos?

124
00:19:51.990 --> 00:19:58.320
Josh Moore: Oh, that's actually much better than the morning crew, so you guys had more time. Awesome. So I'll assume there are a couple of questions on the videos.

125
00:20:00.450 --> 00:20:09.900
Josh Moore: And we'll get to that in just a second. But first, before anything else happens: is anyone lost, or has anything been mentioned that no one has an idea about what it is, that we need to cover?

126
00:20:10.770 --> 00:20:17.520
Josh Moore: Someone actually did me a favor this morning of asking what is NGFF, and so we could actually start off with that. So it's, um,

127
00:20:21.930 --> 00:20:22.650
Josh Moore: Who unmuted.

128
00:20:25.170 --> 00:20:32.520
Josh Moore: Okay, so I'm going to assume everyone's good. You know what's going on, you've watched the videos, you're ready to dig into this and hopefully make some pretty

129
00:20:33.840 --> 00:20:35.160
Josh Moore: Substantial decisions.

130
00:20:36.990 --> 00:20:43.080
Josh Moore: So they were cool keep out of your names. Okay, so there are the four videos so

131
00:20:44.850 --> 00:20:50.310
Josh Moore: Will and Trevor did a fairly good job, I think, of showing the state of the latest specs.

132
00:20:51.180 --> 00:21:00.060
Josh Moore: So, you know, there are three specs that we've worked on as kind of a community. Those have been posted to image.sc and we're still looking for comments.

133
00:21:00.480 --> 00:21:09.660
Josh Moore: Um, in the morning session, we certainly talked about places where those need to be adjusted, and then the most recent spec that we've been working on is for high content screening, and that's what Will showed.

134
00:21:11.640 --> 00:21:12.840
Josh Moore: Any immediate thoughts.

135
00:21:20.820 --> 00:21:31.350
Damir Sudar: Traditional one. One thing that struck me a little bit is that the specs, of course, tie all together. But there's a kind of a, an overarching statement that says

136
00:21:32.550 --> 00:21:42.420
Damir Sudar: What subcomponents are needed, and what makes some of them more urgent, or to go first.

137
00:21:42.510 --> 00:21:43.650
Damir Sudar: And we

138
00:21:44.520 --> 00:21:48.930
Damir Sudar: Know, kind of like a wish list and then a urgency or

139
00:21:51.000 --> 00:22:05.160
Damir Sudar: Needed-soon annotation to each of those things, so that we as a community can add things that we think are important and others can chime in on: yeah, I need that too, or nah, I don't care.

140
00:22:06.240 --> 00:22:17.520
Josh Moore: Yeah, so something we've held off on to some extent is just defining a central repository for all the specifications. And at the moment you're more than welcome to open, so, for

141
00:22:18.450 --> 00:22:26.640
Josh Moore: The repositories linked in the document, yes, open an issue against any of those saying, I need whatever.

142
00:22:27.210 --> 00:22:31.680
Josh Moore: You know, that's where the conversation certainly could take place, or it could take place on image.sc.

143
00:22:32.280 --> 00:22:47.040
Josh Moore: Um, but the point is well taken. So certainly, the content from this morning: there were, I think, three or four clear kinds of requests for a recommendation, like specification of polygons, um,

144
00:22:48.090 --> 00:22:49.680
Josh Moore: I guess I can just go, look what they were.

145
00:22:52.380 --> 00:23:00.900
Josh Moore: Scale and origin information on the multiscales that we put on hold from the multiscale specification, all of that.

146
00:23:01.860 --> 00:23:06.210
Josh Moore: Compression. All of that will need to show up as individual issues and so

147
00:23:06.840 --> 00:23:19.770
Josh Moore: Um, part of what we'll do at the very end of this is just talk about how we keep having these conversations. You know, is it all on GitHub? Is it all in a repository? Are we doing it all on image.sc? When are we having meetings? Things like that.

148
00:23:22.470 --> 00:23:36.690
Josh Moore: As I said in my video, certainly throughout the summer we were kind of heads down and just focused on getting a couple of things done. But now with this meeting, it's all about finding ways to keep the conversation going and make sure that everyone can get themselves heard.

149
00:23:38.460 --> 00:23:53.010
Josh Moore: The flip side of that, though, is it will almost certainly be necessary if we have the community raising issues saying this is vital that we also have more people implementing those issues. So that'll be what I'm requesting right so just be ready for that.

150
00:23:54.900 --> 00:23:55.470
Josh Moore: Anyone else

151
00:24:04.620 --> 00:24:09.990
Josh Moore: Okay, so everyone's good on the state of the videos. Did everyone give it a try and download anything?

152
00:24:13.110 --> 00:24:19.350
Josh Moore: Yeah, we thought about doing the whole live demo thing. That's always a question. Um,

153
00:24:21.240 --> 00:24:22.950
Josh Moore: Okay, so in addition to the things

154
00:24:24.330 --> 00:24:34.380
Josh Moore: I can't tell if everyone's happy or you're just quiet, um, for me it sounds like the two biggest things that we have to probably address. And it was quite different from this morning.

155
00:24:37.140 --> 00:24:46.080
Josh Moore: One basically comes out of the Software for Microscopy paper. So it's actually something I've been wanting to engage with for a while. I don't know who all was there.

156
00:24:46.620 --> 00:24:53.790
Josh Moore: Um, but I can certainly express that I have a very different reading of how this is going to work. And I think the Janelia crew

157
00:24:55.350 --> 00:25:11.550
Josh Moore: Are certainly talking to private ish after the meeting chef on private, um, there's this idea that we can do everything with an API. And I think that's one of the big decisions we have to make is, is it an API or is it a file format, um,

158
00:25:12.870 --> 00:25:16.050
Josh Moore: So that's certainly something we can dig into if anyone's interested

159
00:25:18.720 --> 00:25:25.350
Josh Moore: The processed data: so we can talk about the state of the label images as they stand. One of the requests from this morning

160
00:25:25.800 --> 00:25:35.220
Josh Moore: Was how to go about adding metadata onto the current specifications. You know, how do you add the class name for each of the labels in an image?

161
00:25:35.700 --> 00:25:54.360
Josh Moore: And that kind of went even further: you know, if you have vertices that are stored in the format, can you add information to each of the vertices? Basically, at the highest level, the most generic request is, can we have tabular data in the format? Um,

162
00:25:55.890 --> 00:26:08.040
Josh Moore: And eventually we will be able to. There's actually a problem supporting it in the same way between Zarr and N5. So that's something we can always use help with, making sure that we do it in the same way.

163
00:26:12.030 --> 00:26:16.950
Josh Moore: And then there's the larger metadata questions, which I guess are listed here. Okay.

164
00:26:18.870 --> 00:26:20.070
Josh Moore: I mean, it looks like at the moment.

165
00:26:22.200 --> 00:26:24.810
Josh Moore: It's either multiscale or copying data.

166
00:26:27.870 --> 00:26:31.290
Josh Moore: I think copying data can kind of

167
00:26:32.520 --> 00:26:34.050
Josh Moore: Way, first we want to start there.

168
00:26:36.000 --> 00:26:37.020
Josh Moore: Mark, you want to say something.

169
00:26:37.860 --> 00:26:42.870
Mark Kittisopikul: Sure. So I think the broader question. Well, the maybe the

170
00:26:44.400 --> 00:26:55.230
Mark Kittisopikul: Impetus for that copying data is, I think a lot of us are acquiring raw data from microscopes or various devices, and we have to put it somewhere.

171
00:26:55.770 --> 00:27:10.050
Mark Kittisopikul: And then we have to start analyzing or processing it. And so this is starting to become a really big problem. Usually, coming off the microscope, you've got it in one format, either defined by the control software or by the camera manufacturer,

172
00:27:11.730 --> 00:27:30.900
Mark Kittisopikul: Or, you know, however someone brings it together. And then there's a need to either reformat it into something like we're talking about now, or maybe move it somewhere where it's more accessible. And so this is becoming an increasingly time-consuming process that both inhibits

173
00:27:31.050 --> 00:27:32.700
Mark Kittisopikul: Use of the instrument, but

174
00:27:33.750 --> 00:27:41.580
Mark Kittisopikul: Also, it just makes it a challenge in terms of being able to get it to the people who need it.

175
00:27:45.510 --> 00:27:55.350
Mark Kittisopikul: And so I think maybe one caution about kind of coming up with a new format is, you know, now we're adding yet one more format

176
00:27:56.460 --> 00:28:00.060
Mark Kittisopikul: To the whole list of things and making sure that we're not kind of

177
00:28:00.150 --> 00:28:00.690
Josh Moore: Making it

178
00:28:00.720 --> 00:28:03.600
Mark Kittisopikul: More onerous by requiring yet another copy of the data.

179
00:28:07.320 --> 00:28:08.310
Josh Moore: You want to add on to that.

180
00:28:09.480 --> 00:28:18.690
Davis Bennett: I mean, the issue that's particular to the chunked formats is that if you have any overhead associated with moving one thing on the file system to another, as

181
00:28:19.410 --> 00:28:26.580
Davis Bennett: Soon as you proliferate the number of things on the file system through chunking, that starts to be an issue that wasn't an issue for TIFF or HDF5.

182
00:28:29.070 --> 00:28:29.400
Davis Bennett: Oh,

183
00:28:30.420 --> 00:28:43.620
Davis Bennett: My experience has been, I mean, I'm using one, like Zarr: by default, it puts all the chunks in a single folder. If you try to open that folder and it has a million chunks, you can go get a coffee or something. Is this good

184
00:28:45.540 --> 00:28:48.000
Davis Bennett: Oh, Zarr also supports the nested directory store.
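
NOTE
Editorial sketch, not from the speakers: the difference between the two chunk
layouts mentioned here, assuming the Zarr v2 convention in which chunk indices
are joined with "." in the flat (default) layout and with "/" in the nested
layout. The helper name chunk_key and the array path "volume" are hypothetical.
```python
# Hypothetical helper: only the two separator conventions come from Zarr v2.
def chunk_key(array_path, chunk_index, nested=False):
    """Return the storage key for one chunk of an array."""
    # Flat layout: indices joined by "." so every chunk lands in one folder.
    # Nested layout: indices joined by "/" so each index is a directory level.
    separator = "/" if nested else "."
    return array_path + "/" + separator.join(str(i) for i in chunk_index)
flat_key = chunk_key("volume", (3, 0, 12))
nested_key = chunk_key("volume", (3, 0, 12), nested=True)
print(flat_key)    # volume/3.0.12
print(nested_key)  # volume/3/0/12
```
With a million chunks, the flat layout puts a million entries in one directory,
while the nested layout spreads them over many smaller directories.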

185
00:28:50.460 --> 00:29:02.220
Davis Bennett: And if you want to copy a bunch of things, then parallelism is your friend. Both are not necessarily obvious to people who are used to a format that doesn't fill the file system with little things.

186
00:29:03.510 --> 00:29:11.250
Davis Bennett: And then another perspective is, if you want to sacrifice, this could be an extension to N5 or Zarr: if you're willing to sacrifice atomic writes,

187
00:29:11.790 --> 00:29:20.790
Davis Bennett: then you could fuse chunks together and make a version of a data set. And this is essentially what Neuroglancer, the web-based volume viewer, does.

188
00:29:21.630 --> 00:29:32.070
Davis Bennett: In their format, data are stored in chunks. But then there's a compaction step: the chunks are there and you have 10 files, each of which is pretty big, but it's easy to copy.

189
00:29:33.660 --> 00:29:35.490
Davis Bennett: All of these, sorry.

190
00:29:36.630 --> 00:29:46.080
Eric Perlman: Yeah, Trevor has done some very cool work on this, playing with, I mean, trying to find more optimal formats to use, but it's been cool to follow. I think you've been on those sites as well.

191
00:29:48.630 --> 00:30:00.060
Davis Bennett: I think all of this falls under new territory, and people maybe need to learn new tricks. Like, for me, deleting big containers took a long time, so I wrote some code to parallelize it and now it's not an issue.

192
00:30:02.430 --> 00:30:07.680
Davis Bennett: Modern computers have enough cores that I think that just some coding can get around some of these issues.

193
00:30:10.650 --> 00:30:14.610
Bill Katz: It seems like one of the things that this kind of brings up

194
00:30:15.150 --> 00:30:27.750
Bill Katz: is the notion of the single, you know, chunk-per-file versus, you know, the unsharded versus the sharded representation, for example in Neuroglancer, and also immutable versus mutable.

195
00:30:28.350 --> 00:30:36.450
Bill Katz: Because I think those distinctions mean a lot in terms of shareability, how much data needs to be transformed.

196
00:30:37.980 --> 00:30:45.060
Bill Katz: And it also raises the question, too, of API versus file format, because if you're doing an API

197
00:30:45.510 --> 00:30:59.280
Bill Katz: and data engine, you can do a lot of transformation before you actually get the data, whereas if you're assuming that you're reaching directly into a chunk and need to interpret it immediately from the file format,

198
00:30:59.880 --> 00:31:08.490
Bill Katz: then, um, you know, that limits your options in adding things like transactions, versioning, etc.

199
00:31:15.360 --> 00:31:29.520
Blair Rossetti (Janelia): To Davis's point as well, we're always keeping the user perspective in mind, because while it might be easier for us to write some code to delete this stuff in parallel, visitors undoubtedly will just try and click and drag stuff.

200
00:31:31.350 --> 00:31:37.500
Blair Rossetti (Janelia): And anything that slows them down is going to be a headache for us because they'll complain about the past and future.

201
00:31:38.550 --> 00:31:47.490
Blair Rossetti (Janelia): So I think mechanisms that are non-programmer-friendly are going to be critical, at least for our niche.

202
00:31:51.840 --> 00:31:59.610
Jamie Sherman: From my perspective, and I come from a bit of an outside space, from a mass spec background originally, is that

203
00:32:00.450 --> 00:32:10.470
Jamie Sherman: like, the formats that are created by the instrument manufacturers are entirely designed around dumping data to disk as fast as possible, because acquisition speed is a selling point.

204
00:32:11.190 --> 00:32:22.260
Jamie Sherman: And so they're never going to be optimized for viewing, and they never, like, having worked for a mass spec company, they never spend, they're not interested in spending time

205
00:32:22.890 --> 00:32:37.620
Jamie Sherman: optimizing a viewer for you, or, like, a file format for using it. It's purely for acquisition. But so, like, I think that there is a valid role for creating, quote unquote, yet another file format,

206
00:32:39.960 --> 00:32:49.020
Jamie Sherman: assuming it gives us the usability that we want. And from my, like, the one thing I would love to see, or what gets me excited about Zarr and these other formats

207
00:32:50.280 --> 00:32:51.450
Jamie Sherman: That are chunked is that

208
00:32:52.470 --> 00:32:58.680
Jamie Sherman: I really like the idea of being able to put the data, like, we're trying to share data, right, and we're trying to share it in an open format.

209
00:32:59.580 --> 00:33:18.360
Jamie Sherman: And if we can put it on, you know, Google or AWS or whatever cloud server and let people actually browse the data in that context, that's incredibly valuable, because currently we're already sharing quite a bit, a fair few terabytes of data, and

210
00:33:20.280 --> 00:33:29.340
Jamie Sherman: And the challenge we have is that nobody's going to go there and download a couple terabytes of data to browse through it because you know storage things, etc.

211
00:33:30.420 --> 00:33:44.520
Jamie Sherman: And if the file format is sort of dynamic and we can use, you know, fsspec or whatever to pull the data from, you know, and interact with it through a browser, like napari or whatever,

212
00:33:45.720 --> 00:33:59.970
Jamie Sherman: then that makes the data much more usable. It makes it discoverable from an experimentalist's perspective. I think, it sounds like there are two kind of target markets here: one that's kind of the website, and the other that's more of the

213
00:34:02.370 --> 00:34:06.870
Jamie Sherman: like, the user in a lab who's just dealing with the file as an entity.

214
00:34:08.070 --> 00:34:13.830
Jamie Sherman: And so I guess breaking things into those perspectives might be helpful. Anyway, thanks for listening.

215
00:34:15.570 --> 00:34:17.370
John Bogovic: But I have some related points.

216
00:34:20.490 --> 00:34:27.900
John Bogovic: I think, I think those are great points. And related to that, um, my feeling is that

217
00:34:28.920 --> 00:34:42.060
John Bogovic: while copying data is important, I think I would like to, I don't know if "educate" is the right word, but I think users should avoid just copying data as much as possible.

218
00:34:42.570 --> 00:34:46.500
John Bogovic: And especially when data gets large enough

219
00:34:47.100 --> 00:34:57.930
John Bogovic: And to that end, I guess I would hope that there is a regime in which these chunked file formats just shouldn't be used; that is, for small to medium sized data, you shouldn't have a million chunks of a

220
00:34:58.350 --> 00:35:09.630
John Bogovic: data set that is small enough, whatever that means. So, I mean, this is, I think, true of Zarr and probably true of N5, that there might be a regime in which

221
00:35:10.560 --> 00:35:22.980
John Bogovic: it's chunked into just one chunk and that's it. Is that reasonable? I'm curious what people think about that idea, that is, chunked data sets should have a fallback to an unchunked

222
00:35:24.030 --> 00:35:25.890
John Bogovic: version that is still viable.

223
00:35:28.710 --> 00:35:36.120
Ilan Gold: I just want to comment on something I ran into maybe a couple months ago. I work on the HuBMAP portal, and something

224
00:35:36.570 --> 00:35:44.760
Ilan Gold: I ran into when developing our tools was that one of the things you can't do with Zarr that you can do with, like, a single file: you can't just, like, download it in your browser. You need

225
00:35:45.060 --> 00:35:51.120
Ilan Gold: like, utilities to get at that folder structure, I think. I don't think you can kind of just wget a Zarr store.

226
00:35:51.450 --> 00:35:59.130
Ilan Gold: Unless it's, I think there's maybe like a way to do it. But as far as I understand, there's not like a way to just wget it; like, you need a command-line utility

227
00:35:59.430 --> 00:36:09.000
Ilan Gold: for a cloud provider, gsutil or aws s3 cp, something like that. That's maybe another case where, like, having both file formats stored would be useful.

228
00:36:09.390 --> 00:36:15.510
Davis Bennett: So we'd have to be, I think the fundamental issue is that the metadata is stored in a separate object from the data.

229
00:36:16.590 --> 00:36:20.430
Davis Bennett: Unless you zip it together, there's not one thing you could get that has the whole story.

230
00:36:20.760 --> 00:36:23.130
Ilan Gold: Right, so I brought this up specifically because you said like, oh,

231
00:36:23.160 --> 00:36:34.320
Ilan Gold: You know, it's obviously, like, possible to do this, but for non-computer, like, it requires, like, a little bit of computer science to be able to, like, download this utility or use the utility, whereas some users maybe don't want that. So,

232
00:36:36.750 --> 00:36:39.390
Josh Moore: So I have a couple of thoughts, they're not

233
00:36:41.130 --> 00:36:53.880
Josh Moore: exactly well structured on this. From the OME side, certainly, and from what we did just in the experiments we've done so far with Zarr, um, it's a huge benefit to be able to say

234
00:36:55.290 --> 00:37:07.860
Josh Moore: this thing is isomorphic wherever we put it. So we have the images locally, we work with it one way; the code works identically if it's remote. Obviously there are performance impacts, but, um,

235
00:37:08.880 --> 00:37:18.030
Josh Moore: And I actually think with this specification, well, I think a goal of any specifications we come up with should also then apply to other formats. So

236
00:37:18.450 --> 00:37:29.640
Josh Moore: You know, in the regime where you would rather have one file, use HDF5, you know, that's fine. It's a good format and, you know, it has its limits. But all of these things do. It's a trade-off.

237
00:37:30.240 --> 00:37:41.100
Josh Moore: Um, I think it's still useful for us to have these conversations about these various conventions, like other communities have done. So I'm thinking of the geospatial community, where

238
00:37:41.490 --> 00:37:57.360
Josh Moore: there's an entire process where they go through saying here's how we grow and how we're going to represent maps, and they've gone through exactly this process. There was NetCDF for many years, and their data sat in NetCDF, which is HDF5, and now they're moving to Zarr and free

239
00:37:58.410 --> 00:38:10.170
Josh Moore: Own. Um, and now, you know, the underlying NetCDF library is being taught how to work with Zarr. So the two are completely perfect, so you know that you're getting everything out of a piece of data

240
00:38:10.980 --> 00:38:20.700
Josh Moore: regardless of what store it's sitting in. You know, so I think there's an abstraction here that we can use to stop worrying a little bit about

241
00:38:21.240 --> 00:38:26.370
Josh Moore: The specifics. You know, we want to enable for the user that they can choose one or the other, um,

242
00:38:27.000 --> 00:38:37.560
Josh Moore: But if it's, but if you're sure. So what's missing at the moment from Bio-Formats is, you don't know this, you know, we don't know from the proprietary file formats if we have absolutely everything, is it structured properly.

243
00:38:37.980 --> 00:38:43.470
Josh Moore: But if we've made these specifications together as a community, we can say we want to do this in HDF5.

244
00:38:44.190 --> 00:38:55.020
Josh Moore: We could choose a sub-format of HDF5, we could choose BDV or Imaris, and say we're going to do the single-file use case in this container.

245
00:38:55.560 --> 00:39:10.980
Josh Moore: For the cloud use case, you know, we're working to have Zarr and N5 share the same underlying specification. Okay, that needs to be somehow unified, and then there will be one cloud-based, you know, small-chunk-based format, and it's all the same.

246
00:39:12.060 --> 00:39:22.020
Josh Moore: So I think, um, it's an involved way of saying, you know, try to keep the number of formats as small as possible, which is what a lot of you mentioned, you know, in your intros. So

247
00:39:23.010 --> 00:39:34.620
Josh Moore: maybe it's a question about how to go about doing that. If I roll back even farther and start to think about the API versus file format, I guess I have, and maybe it's a naive

248
00:39:35.310 --> 00:39:52.890
Josh Moore: hope, that we can have a format that is this isomorphic, that we can put anywhere, we can store it in any way. And then we start to build up extensions to the specification that allow us to do some of the more complex API-like

249
00:39:54.510 --> 00:40:00.300
Josh Moore: indexing of the data, you know, there are things that are beyond what a file format can do. Um,

250
00:40:00.960 --> 00:40:14.340
Josh Moore: I guess just, if I go to a service, I would like that service to still provide me the base underlying specifications that we've all agreed on, you know, in the simplest way to get it, now I wget the entire data.

251
00:40:15.840 --> 00:40:24.000
Josh Moore: But if you're on top of it, by all means, why not, right? It may be that some implementations don't

252
00:40:25.080 --> 00:40:34.710
Josh Moore: support all of the indexing. And I think this is probably a lot, I mean, I don't know much about Neuroglancer, I basically know what Eric has taught me. Um, but I think

253
00:40:35.940 --> 00:40:46.590
Josh Moore: You know, I think all of its functionality can be basically considered an extension to the spec that some communities may need and then we kind of get the best of both worlds. Okay, that was my spiel.

254
00:40:51.390 --> 00:40:51.750
I mean,

255
00:40:57.510 --> 00:41:08.520
Davis Bennett: I would just, like, I was used to using image viewers like Fiji or ImageJ. Then, as a user, you need to think about the file format of your data.

256
00:41:09.480 --> 00:41:20.820
Davis Bennett: Because you need to know what's in the file system. You need to know, can ImageJ read my TIFF file? ImageJ, for a while, 64-bit floats, for a while. So this was always a headache for me.

257
00:41:22.650 --> 00:41:41.550
Davis Bennett: But now that I'm working a lot with Neuroglancer, it's extremely refreshing. Neuroglancer has no idea what format my data is in. And I think a lot of users would actually prefer to be in that situation and never need to care what file format the data is actually stored in.

258
00:41:42.630 --> 00:41:50.280
Davis Bennett: They just know where the data is, and can my tool understand what data is at that place.

259
00:41:51.480 --> 00:42:00.630
Davis Bennett: So, I think my experience working with image as an API or image as a service has been, it's felt way nicer than image as a format.

260
00:42:02.010 --> 00:42:10.560
Davis Bennett: Oh, if the visualization tools would approach, but if Fiji, for example, would stop having an opinion about, you know, different TIFFs,

261
00:42:10.980 --> 00:42:24.750
Davis Bennett: would delegate actually handling the file format to some other service and would just focus on rendering images that it gets from who knows where, I think a lot of these problems would be, obviously, pushed maybe somewhere else.

262
00:42:27.210 --> 00:42:27.540
Davis Bennett: I

263
00:42:28.380 --> 00:42:30.600
Josh Moore: Still think some of the conversations we're having. We're still gonna

264
00:42:31.560 --> 00:42:33.330
Josh Moore: We're still going to need to have, but

265
00:42:34.800 --> 00:42:35.940
Josh Moore: I don't disagree with you at all.

266
00:42:37.980 --> 00:42:38.850
Josh Moore: Caterina, you have your hand up.

267
00:42:43.560 --> 00:42:45.210
Caterina Strambio: Yes, primarily a question.

268
00:42:46.560 --> 00:42:52.290
Caterina Strambio: So the idea is you know I think minimal number of formats. I'm

269
00:42:54.030 --> 00:43:05.220
Caterina Strambio: Definitely obviously very much on board. I'm just thinking, I don't know if we have addressed this issue or if we need to address it here. How do we get from

270
00:43:05.850 --> 00:43:27.300
Caterina Strambio: here to there, because, I mean, we know that the manufacturers of the microscopes and the software that goes with that microscope will want to continue to produce their file formats, so that they maximize, as Jamie was saying,

271
00:43:28.560 --> 00:43:29.820
Caterina Strambio: Writing on disk.

272
00:43:32.430 --> 00:43:34.560
Caterina Strambio: Is the idea that we're just gonna

273
00:43:35.970 --> 00:43:48.990
Caterina Strambio: come up with a great solution? We're going to ask them to participate in a conversation and then present them with, you know, like a fait accompli, like, this is what we want to do, and come up with it, or

274
00:43:50.010 --> 00:44:01.560
Caterina Strambio: Which I mean I don't have any anything in principle against it. It's just, I'm just thinking, what are we, do we need to talk about how we're going to get there in terms of what manufacturers will do

275
00:44:02.250 --> 00:44:07.590
Ulrike Boehm: I think they should be invited as well, right? To my mind, like, everyone who kind of works

276
00:44:08.250 --> 00:44:20.430
Ulrike Boehm: in a slight way on data formats. Also the developers of current data formats like N5 and Zarr and so on and so forth. They need to be part of this discussion as well, because they are actually the people who

277
00:44:20.910 --> 00:44:29.460
Ulrike Boehm: also know a bit more about the feasibility, whether this type of data format, how it has been created, is flexible enough to fulfill the demands, right?

278
00:44:30.330 --> 00:44:32.040
Caterina Strambio: Well, the developers of the

279
00:44:32.700 --> 00:44:38.790
Caterina Strambio: N5, Zarr and stuff, they're part of it. The question is more like the manufacturers. So just clubs. Anyway, I'll shut up.

280
00:44:41.520 --> 00:44:52.830
Davis Bennett: Is it, the many years, at least for light microscopes, they've had the opportunity for years to at least be on a standard between manufacturers. So, as far as I know, Zeiss and Olympus

281
00:44:53.940 --> 00:45:00.300
Davis Bennett: all use different formats. So if they actually cared about providing something that would be convenient for users, they would have done it.

282
00:45:03.450 --> 00:45:04.860
Josh Moore: So I guess I want to give them the

283
00:45:04.890 --> 00:45:07.950
Josh Moore: Benefit of the doubt. So we've had multiple conversations

284
00:45:08.370 --> 00:45:17.280
Josh Moore: So Stephen can relate Wagner from Zeiss like no Conrad was on the session this morning, you can kind of see his statements, um,

285
00:45:18.720 --> 00:45:27.480
Josh Moore: There were, you know, a handful of vendors five or six at the European bio imaging industry board meeting, I guess, at the end of 2018

286
00:45:28.140 --> 00:45:39.720
Josh Moore: So it's been better than ever before talking to vendors and hearing them say, okay, we could see possibly having a developer work with you guys to develop something we could see

287
00:45:40.710 --> 00:45:50.730
Josh Moore: And some of the statements are, well, if there's a good C or C++ library, then OK, we'll consider. You know, we're making baby steps towards this. And I guess that's what Caterina is saying as well.

288
00:45:52.080 --> 00:45:58.980
Josh Moore: You know, from our point of view, what, what we can see doing is just trying to get to the point where there's a solid

289
00:45:59.880 --> 00:46:10.020
Josh Moore: specification, solid libraries, um, and hope to sell it to them. You know, I don't know anything else to do. It sounds like there's additionally

290
00:46:10.980 --> 00:46:22.290
Josh Moore: the API conversation on top of that, which I think is still interesting. I don't exactly know what the vendors will, how they will see their interaction with the API as opposed to a file format, you know.

291
00:46:22.890 --> 00:46:30.240
Josh Moore: I don't disagree with what you guys are saying, you know, we're very, very file format based and just we've been forced to be for for many years.

292
00:46:31.500 --> 00:46:43.290
Josh Moore: Maybe there's a way to free ourselves to look even higher for a better solution. I still think it will revolve around good implementations good specifications and lots of conversations

293
00:46:45.120 --> 00:46:54.570
Josh Moore: You know, I just assume if we start to look at APIs, for example, there will be, you know, there will be things that other communities are going to need to add into that API. And to the extent that that's doable,

294
00:46:54.780 --> 00:47:09.630
Josh Moore: you know, all on board, but it's still the same process, and maybe we redefine what NGFF is, but it's still the same process of this community and as many other communities as want to participate. Completely agree with Ulrike, you know, everyone should be on board.

295
00:47:11.310 --> 00:47:21.180
Josh Moore: Of making sure you know we have a process to go if we don't agree if the API or the file format is slightly different in two places. How do we get them unified

296
00:47:22.800 --> 00:47:35.100
Josh Moore: Basically, you know, mea culpa, I've mostly focused on doing that with Zarr and N5 to date. You know, we got money to get someone to work on it. They're working on it, it should be done by the end of the year. I'll cross my fingers.

297
00:47:36.330 --> 00:47:49.950
Josh Moore: And then we move on to the next problem, right? So if any of you know, you know, answering or trying to answer Caterina's first question, if you know the next thing we really need to tackle to make this happen, you know, let's do it. That's what this meeting is all about.

298
00:47:50.550 --> 00:47:50.790
Bill Katz: I think

299
00:47:50.820 --> 00:47:52.020
Josh Moore: All of one, sorry. Go ahead.

300
00:47:52.950 --> 00:47:55.650
Bill Katz: So from, from my perspective, it seems like

301
00:47:57.000 --> 00:48:04.890
Bill Katz: You know, I see a lot of the discussion is primarily focused around dense data that we kind of like that makes a lot of sense.

302
00:48:05.310 --> 00:48:18.570
Bill Katz: Because 2D, 3D and nD image data will be dense, and so when you're doing physical partitioning like N5 and Zarr do, that also makes sense, because each of the files are relatively similar size.

303
00:48:19.080 --> 00:48:28.410
Bill Katz: One of the things that I really liked about TileDB, for example, is that they completely differentiate between the physical partitioning and the case for

304
00:48:28.740 --> 00:48:29.640
Bill Katz: sparse data.

305
00:48:30.180 --> 00:48:40.410
Bill Katz: For things that you don't actually you can't easily physically partition, because then you, you want to do it partitioning more based on capacity.

306
00:48:41.400 --> 00:48:53.310
Bill Katz: And so I think one of the issues that I see with trying to standardize a file format is you have to know if the file format would be suitable for the different kinds of use cases.

307
00:48:53.790 --> 00:49:04.440
Bill Katz: And it's not clear to me, for example, that you could get a file format that supports both dense and sparse. I believe in the TileDB case, for example, they have two different file

308
00:49:05.220 --> 00:49:18.990
Bill Katz: formats for those two different cases. And I think that they're trying to create a universal data format and they think that the dense and the sparse case handles everything so that they can build data frames on top of it, etc.

309
00:49:19.440 --> 00:49:28.920
Bill Katz: But I think one way to approach this might be saying, Okay, what is the minimal set of sort of characteristics of data that we need to support

310
00:49:29.640 --> 00:49:35.070
Bill Katz: And what are the file formats that we believe you know at least suitable for that.

311
00:49:35.700 --> 00:49:49.890
Bill Katz: Because it's not clear to me, for example, if you go the N5/Zarr route, whether that's even suitable for, for example, doing sparse volumes, where you're just interested in a few neurons, you know, that are spanning massive amounts of space.

312
00:49:51.090 --> 00:49:56.250
Bill Katz: Or if you're doing meshes or other things that. So it really depends on the type of data.

313
00:49:57.360 --> 00:50:04.440
Josh Moore: So agreed. So just as a side note, we have had conversations with TileDB and actually just from aerosol

314
00:50:05.130 --> 00:50:22.020
Josh Moore: Go outside of having some base implement a SAS based format that everyone's using and then building their features on top of it. So I would hope we could even work with TileDB. I know their implementation is a very strong one. So, okay, make use of that and tie everything together.

315
00:50:22.440 --> 00:50:35.100
Bill Katz: I guess I'll also put a disclaimer here that I'm starting to collaborate with the founder of TileDB on adding branched versioning to their system, which I think is natural, given the way that they handle fragments and stuff like that.

316
00:50:39.120 --> 00:50:40.560
Josh Moore: So I think I'm good. Go for it.

317
00:50:41.460 --> 00:50:56.160
Ola Tarkowska: But basically I definitely agree with you, Bill, that the use cases are crucial to define, because, as I said on the chat, I found that the needed optimization in the file format

318
00:50:57.420 --> 00:51:09.450
Ola Tarkowska: needs to happen on the process level, and that optimization is happening for each individual process. So whether it's visualization, analysis, registration or

319
00:51:10.080 --> 00:51:28.200
Ola Tarkowska: archiving, they require very different data. It's the same pixel data, but different formats are needed to process this the most efficient way. And if we dig more into those processes, they just require different solutions, so

320
00:51:29.220 --> 00:51:35.670
Bill Katz: I think we can see that very easily too, if you're supporting visualization applications, right?

321
00:51:35.970 --> 00:51:46.320
Bill Katz: Because of the style that it's written in, you almost want it to be GPU-available. So, like, for example, Neuroglancer

322
00:51:46.710 --> 00:51:58.290
Bill Katz: has, as part of their format, he designed a compression scheme for segmentation such that it can be easily loaded and then, you know, be altered using shader code or whatnot.

323
00:51:59.580 --> 00:52:08.970
Bill Katz: And so I think that it's almost like in the database world. You can almost see this denormalization approach, right, that you'd like to have a normalized form of the data.

324
00:52:09.300 --> 00:52:18.150
Bill Katz: But it just is not suitable for a variety of use cases where you're going to want to denormalize the data so that it can be rapidly done, and so

325
00:52:19.440 --> 00:52:26.340
Bill Katz: I think that's another aspect, because, like, we're also seeing the graphics card companies are starting to support things like direct storage,

326
00:52:26.730 --> 00:52:38.190
Bill Katz: Where you can move the data directly from the whatever store you have into the GPU. Um, and so, yeah, these are all things that are sort of part of it.

327
00:52:44.220 --> 00:52:44.730
Josh Moore: I think I missed

328
00:52:44.880 --> 00:52:45.810
Nicolas Chiaruttini: Some other hands.

329
00:52:45.870 --> 00:52:46.350
Josh Moore: Go for it.

330
00:52:46.860 --> 00:52:59.520
Nicolas Chiaruttini: If I can comment just quickly on the political side towards the vendors, I would just like to mention that, in the microscopy facility, or also imaging facility,

331
00:53:00.150 --> 00:53:09.780
Nicolas Chiaruttini: they need to have a common workflow for all the vendors, to get something which is not 50 different processes to do the same segmentation.

332
00:53:10.200 --> 00:53:22.260
Nicolas Chiaruttini: And basically, in the call for tender, you can explicitly specify you want to have the Bio-Formats API supported, in multi-resolution and so on.

333
00:53:23.220 --> 00:53:33.840
Nicolas Chiaruttini: And this is an argument also to push the vendors to go towards something unified. So if there is a good specification, then you can push

334
00:53:34.740 --> 00:53:52.410
Nicolas Chiaruttini: for this at the microscopy facility level. So we did this with a microscope. Unfortunately, we did not specify multi-resolution, so we ended up with a Bio-Formats API without multi-resolution, and I'm pushing a lot the people from Leica.

335
00:53:53.490 --> 00:54:01.710
Nicolas Chiaruttini: Not to mention them, that I hope will watch this. But just to mention that at the microscopy facility level, you can really

336
00:54:02.430 --> 00:54:12.960
Nicolas Chiaruttini: have an impact, because when you buy equipment, if the vendors see in the call for tender that they need to support something, at some point, simply, they will do it.

337
00:54:20.910 --> 00:54:21.720
Jason Swedlow: A couple comments.

338
00:54:22.920 --> 00:54:25.050
Jason Swedlow: So say a couple things just from the standpoint

339
00:54:25.050 --> 00:54:26.250
Jason Swedlow: Of our experience.

340
00:54:28.560 --> 00:54:38.220
Jason Swedlow: With respect to the vendors. I think Josh is correct. There's an awful lot of interest. It's fair to say Zeiss has been the most present, but we know

341
00:54:38.670 --> 00:54:46.020
Jason Swedlow: we're getting a lot of attention from Nikon and Olympus, as well as Leica. I've had a lot of conversations, high-level conversations, with Leica.

342
00:54:48.210 --> 00:54:50.850
Jason Swedlow: Probably just to summarize, an awful lot.

343
00:54:52.440 --> 00:54:56.490
Jason Swedlow: This is going to be this is going to be a long process to work through with them and it's going to be

344
00:54:56.940 --> 00:55:05.820
Jason Swedlow: I think, just parroting Josh, it's going to be a lot of talking, but it will be based on what he just said, which is, you know, very strong specifications, and if those exist,

345
00:55:06.270 --> 00:55:16.710
Jason Swedlow: they will pay attention and they will adopt them. Proof of that is, we've been anecdotally told several times by instrument manufacturers, they tell their customers to use Bio-Formats.

346
00:55:17.100 --> 00:55:22.770
Jason Swedlow: Right, so that's that's kind of an example of that. So that's, that's, that's the first thing that's possibly

347
00:55:23.340 --> 00:55:29.130
Jason Swedlow: A little bit pathologically optimistic. I'm usually accused of being that, that's my role in the project. But, you know,

348
00:55:29.940 --> 00:55:38.010
Jason Swedlow: But all of that won't happen in the next week. I think the other thing we're seeing is a single format just won't do it, and to pretend that

349
00:55:39.000 --> 00:55:41.790
Jason Swedlow: All of the use cases that we're talking about. We're covering

350
00:55:42.750 --> 00:55:54.960
Jason Swedlow: First of all, the modalities that we're talking about, the kinds of visualization that Bill is thinking about with respect to, you know, a whole brain, and I actually want to focus on two different neurons that are separated by millimeters,

351
00:55:56.070 --> 00:56:06.600
Jason Swedlow: public repositories, etc. I mean there's just extremely, extremely different needs and use cases. So the pretense that oh, there will be a single thing out there, it will solve all the world's problems.

352
00:56:06.990 --> 00:56:18.840
Jason Swedlow: Is I think needs to be put to bed. I will tell you, however, that as I kind of slightly jokingly mentioned, you know, I end up having to pay the bills around here, at least for me.

353
00:56:19.620 --> 00:56:24.180
Jason Swedlow: I was on a two-hour call with a funder. And so if you say to them,

354
00:56:24.720 --> 00:56:36.780
Jason Swedlow: If the call is about interoperability and data access and open data, and they say, wait, wait, wait, you're going to solve this by coming up with yet another file format? Actually, no, not just one, probably four or five.

355
00:56:37.290 --> 00:56:50.850
Jason Swedlow: You know, basically, understanding that is really super hard. And so that gets to be a hard, hard... So we've had some success with CZI, and we're very grateful for that. But the reality is we're going to have to

356
00:56:51.300 --> 00:57:05.880
Jason Swedlow: Work on how we present this in, actually, a very serious way, not only to funders but to our institutions as well. Right. So, I mean, you know, selling this within the various institutions that you all work for is going to be a challenge.

357
00:57:07.680 --> 00:57:18.630
Jason Swedlow: But you know that's that's just part of it. And there was one final point I can't remember what it was. But yeah, I think there's just the reality. Many of the things that we're talking about here are

358
00:57:21.030 --> 00:57:31.740
Jason Swedlow: are reflective of where we are. Sorry, the last point, with respect to APIs: I'm also the person who ends up having to fund Bio-Formats.

359
00:57:33.690 --> 00:57:48.360
Jason Swedlow: We have a lot of experience in using Bio-Formats in the Image Data Resource. You know, the idea of an API and just random formats underneath is running out of steam. It really is too much to ask

360
00:57:50.370 --> 00:58:02.970
Jason Swedlow: When driven by the data, you know, with the complexity of the data coming in. The sheer size is part of it, but part of it is all the other things that we're discussing. So

361
00:58:04.320 --> 00:58:17.280
Jason Swedlow: I think that's our experience. I think everybody on the team agrees with me; if not, say so. But we need some kind of convergence at that storage level.

362
00:58:18.660 --> 00:58:35.310
Jason Swedlow: So, in partially responding to Davis's point, with which I'm sure we all agree: having infinite divergence at that level, we're seeing that run out of steam. So Bio-Formats basically can't keep up

363
00:58:36.510 --> 00:58:45.570
Jason Swedlow: When you're working at scale, right, so that's the sheer reality, we see that all the time, et cetera. Sorry, so that's several points, but

364
00:58:47.100 --> 00:58:56.010
Jason Swedlow: There will be multiple formats. I think, as Josh said, you know, can we nail down exactly what are the things that we're going to need

365
00:58:57.120 --> 00:59:05.220
Jason Swedlow: To satisfy this range of requirements so you know Bill's got one end, which is actually extremely important. I mean, that's a key biological problem.

366
00:59:05.700 --> 00:59:17.220
Jason Swedlow: We with public data resources have another end, and several other dimensions to think about there. But, you know, if we can nail those down, I think that would really help us decide a path forward.

367
00:59:20.070 --> 00:59:23.700
Bill Katz: So Jason, if I could comment on what you just said about

368
00:59:24.720 --> 00:59:31.980
Bill Katz: You know the explosion of the file formats. There was a and I probably should have posted the links on that there was

369
00:59:32.640 --> 00:59:41.640
Bill Katz: Two blog posts: one by OpenML about what would be the ideal data format for open machine learning data sets.

370
00:59:42.090 --> 00:59:58.350
Bill Katz: And then there was a response by the TileDB founder about why they should consider TileDB as a universal data format, and it was primarily focused on... And I would say that the one argument on which I kind of agree with Stavros is that

371
01:00:00.180 --> 01:00:02.190
Bill Katz: The TileDB approach, which is

372
01:00:02.820 --> 01:00:14.730
Bill Katz: The focus should be on data engines and APIs over file formats, because at least with data engines, and by data engine I just kind of mean that it could be a library, which could be like an embedded library,

373
01:00:15.060 --> 01:00:21.660
Bill Katz: Where you don't actually know what's going on. But you, you have an API, you can ask for things and it automatically figures out

374
01:00:21.960 --> 01:00:30.300
Bill Katz: How to get the information that you need, or it can be an HTTP API or some kind of service, which also gives you that ability to kind of like

375
01:00:31.080 --> 01:00:38.460
Bill Katz: You know, change everything on the back end. Fine. You can change the file formats, but your request on the client side won't change and

376
01:00:39.390 --> 01:00:57.030
Bill Katz: So I think that was the TileDB approach: that we should focus on that kind of effort. And I'm curious, because it almost seems like that might be an interesting, you know, place to start thinking about it: file format versus data API or embedded engine.

377
01:00:57.480 --> 01:01:09.570
Jason Swedlow: So just to say, I mean, at least from the group that I know, I think Melissa and Aaron have the most experience with TileDB. So, Glencoe's bioformats2raw converter already supports TileDB.

378
01:01:12.390 --> 01:01:17.130
Jason Swedlow: Am I misspeaking? So Aaron says it's not. So yes, it is something we're definitely looking at.

379
01:01:17.610 --> 01:01:17.850
Bill Katz: I mean,

380
01:01:18.390 --> 01:01:29.340
Bill Katz: The interesting aspect is that we're already kind of seeing this, in that there's a number... like, for example, the N5 implementation apparently supports Zarr in the back end, the Zarr implementation

381
01:01:30.090 --> 01:01:36.660
Bill Katz: Supports N5, you have TensorStore now, and what Jeremy's doing with Neuroglancer is he's supporting,

382
01:01:37.020 --> 01:01:50.280
Bill Katz: My god, he's supporting DVID, BOSS, you know, a couple of other connectomics APIs, as well as precomputed, as well as N5 and Zarr. And so, you know, what we're seeing is that the

383
01:01:50.610 --> 01:01:59.010
Bill Katz: That the data engine, the embedded libraries are basically doing the job of like figuring out all the file formats.

384
01:01:59.100 --> 01:02:06.930
Josh Moore: We so we should probably. So this sounds like something we should do a breakout on and I mean I'm more than happy and take time and

385
01:02:07.680 --> 01:02:20.610
Josh Moore: I mean, even in the next, you know, day or two. My experience has been, when you start this, eventually you always get back to the question of who's taking the responsibility that Bio-Formats currently has. It doesn't go away.

386
01:02:21.120 --> 01:02:32.070
Josh Moore: And currently we have it. We're trying to get rid of it. And it's going to fall on someone else if we don't solve this, if we're trying to be good about this right and go, you know, you want to get rid of this. No one should suffer the pain.

387
01:02:33.480 --> 01:02:42.690
Josh Moore: It always becomes... as long as you have a number of implementations, and I've worked heavily with the N5-Zarr and the Zarr-N5 implementations, and they have the same problems, they don't...

388
01:02:43.170 --> 01:02:55.470
Josh Moore: It's all hard-coded at the moment, and it's just barely working. So we need a better way, whether it's the engine or a format. We're definitely not there yet. Sorry, Damir has really been struggling, and then Caterina.

389
01:02:57.150 --> 01:03:05.820
Damir Sudar: You're probably asking the same the same thing or proposing the same thing. And so this is pretty much by the

390
01:03:07.230 --> 01:03:29.280
Damir Sudar: Way the separation is between the common part of the storage format, whether it's file or API or whatever, and the implementation of where pixels go. So if we can think of moving as much of the thing that should be common as possible into the metadata layer, and as little as possible

391
01:03:30.780 --> 01:03:43.620
Damir Sudar: In the implementation of the where-pixels-go layer, then potentially we can move as much to a common API or file format as

392
01:03:44.220 --> 01:03:53.250
Damir Sudar: As we can achieve. And then it becomes maybe a little easier to just swap out that pixel storage layer underneath.

393
01:03:53.670 --> 01:04:01.980
Damir Sudar: When it's needed, or have support multiple ones if that's really how it needs to be. And clearly from the discussion. It needs to be that way.

394
01:04:02.550 --> 01:04:11.760
Damir Sudar: So, but push as much as possible into the metadata layer, which should be common, and there should not be multiple versions of or multiple access points to

395
01:04:13.890 --> 01:04:13.980
That

396
01:04:23.370 --> 01:04:24.480
Caterina Strambio: Can I speak.

397
01:04:28.560 --> 01:04:29.490
Josh Moore: My mic was off.

398
01:04:31.020 --> 01:04:32.850
Caterina Strambio: Okay, now I just wanted to

399
01:04:35.280 --> 01:04:37.860
Caterina Strambio: Just a point of clarification for me.

400
01:04:39.360 --> 01:04:48.120
Caterina Strambio: Maybe others. So we're talking about, you know, the combination. The, the, kind of the dichotomy between API versus format.

401
01:04:50.070 --> 01:05:00.870
Caterina Strambio: The but I'm trying to understand. So if we would have multiple because we decided that there is not going to be a single new format. So it's going to be next generation file formats.

402
01:05:01.950 --> 01:05:04.260
Caterina Strambio: Then there will be API between those

403
01:05:05.460 --> 01:05:08.670
Caterina Strambio: Or not? I mean, am I misunderstanding something?

404
01:05:11.400 --> 01:05:11.730
Josh Moore: Maybe

405
01:05:11.970 --> 01:05:16.920
Caterina Strambio: There should be translation, I mean, if we decided that there needs to be more than one.

406
01:05:18.360 --> 01:05:29.550
Josh Moore: From our side, yeah, they will. If there are multiple formats in the OME space, you will have to be able to translate between them. So that's a given. So from the API side,

407
01:05:30.300 --> 01:05:39.360
Josh Moore: I don't know. Does everyone feel comfortable that there's one clear API that represents all the needs, and you don't need to map between multiple APIs? I'd find that

408
01:05:39.720 --> 01:05:40.770
Josh Moore: Amazing but

409
01:05:40.950 --> 01:05:44.820
Davis Bennett: I think for dense image data that's reasonable.

410
01:05:46.170 --> 01:05:47.280
Davis Bennett: Like if you think of

411
01:05:47.640 --> 01:05:49.140
Davis Bennett: An image as being a

412
01:05:50.250 --> 01:06:00.120
Davis Bennett: Grid of pixels, the job of some software that consumes the image is just to ask for some region of pixels, and it gets the values back.

413
01:06:00.840 --> 01:06:11.910
Davis Bennett: Then a separate query for metadata. I think that's abstract enough to cover 99% of what people do, what applications do that consume dense images. Sparse stuff and other things, I think, is
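
NOTE
The "ask for a region, get values back, query metadata separately" idea described here can be sketched as a small interface. This is only an illustration under stated assumptions: the names DenseImageSource, InMemoryImage, read_region and metadata are hypothetical and do not come from any library mentioned in the call.
```python
from typing import Protocol
class DenseImageSource(Protocol):
    """Hypothetical minimal interface for dense image data."""
    def read_region(self, start, shape): ...
    def metadata(self): ...
class InMemoryImage:
    """Toy 2D backend holding pixels as nested lists (assumption for the sketch)."""
    def __init__(self, pixels, metadata):
        self._pixels = pixels
        self._metadata = dict(metadata)
    def read_region(self, start, shape):
        # start is (row, column); shape is (height, width)
        (y0, x0), (h, w) = start, shape
        return [row[x0:x0 + w] for row in self._pixels[y0:y0 + h]]
    def metadata(self):
        # Metadata is a separate query and never touches pixel data
        return dict(self._metadata)
def mean_of_region(source: DenseImageSource, start, shape):
    """A consumer that only depends on the interface, not on the backend."""
    region = source.read_region(start, shape)
    values = [v for row in region for v in row]
    return sum(values) / len(values)
image = InMemoryImage([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], {"axes": "yx"})
```
Any backend that offers the same two methods could be swapped in without changing mean_of_region.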

414
01:06:12.060 --> 01:06:14.280
Josh Moore: More complicated. Yeah. So I think that's an extra call, right, so

415
01:06:15.300 --> 01:06:17.430
Bill Katz: Yeah, I think that usually there are

416
01:06:18.480 --> 01:06:27.990
Bill Katz: Multiple APIs, where each API is for a particular use case. So for example, you might have an n-dimensional array API,

417
01:06:28.440 --> 01:06:43.740
Bill Katz: Or a multidimensional, I mean, sorry, multiscale multidimensional array API. And that serves, you know, most of what you might need for an image format. But if you wanted to do, for example, like a log

418
01:06:45.060 --> 01:06:57.150
Bill Katz: API, that might be a little bit different, and the format of the data might be different, like if you just wanted to replay all the commands or all the issues. And so that's one of the other things, too, is like

419
01:06:58.080 --> 01:07:08.430
Bill Katz: What are the characteristics of the data, and how would you have an API? What is the minimal set of APIs necessary for the data that we need in science?

420
01:07:15.660 --> 01:07:22.110
Ola Tarkowska: I think, to answer your question, the minimum API could be the one that describes the data life cycle.

421
01:07:24.240 --> 01:07:45.570
Ola Tarkowska: Because we speak a lot about data, but the data life cycle could also be taken into account, where this actually defines the purpose of each workflow, you know, of the process, how data is transformed, if that makes sense. This is a different approach.

422
01:07:48.540 --> 01:07:53.850
Bill Katz: I mean, to be more concrete about, like, systems that we've had to make on our side:

423
01:07:54.780 --> 01:08:03.180
Bill Katz: We have a different API, supported by a different data type, for three-dimensional annotations. So for example, if we need to do synapses,

424
01:08:03.480 --> 01:08:11.520
Bill Katz: Those are a series of 3D points that are sparse. They may have characteristics or properties associated with it, like they're associated with a label or

425
01:08:11.760 --> 01:08:23.910
Bill Katz: That they're considered pre-synaptic, post-synaptic, etc. So that's a very different type of data that may be extremely large, like you could have 100 million of these data points. It's more like a point cloud

426
01:08:24.240 --> 01:08:42.090
Bill Katz: kind of scenario. And then we have a different data type in a different API for our n dimensional dense stuff. We have a different API. If we wanted sparse volume representation. So for example, if you needed, like a neuron that stretched over large periods of, you know, space.

427
01:08:43.110 --> 01:08:51.540
Bill Katz: And each of these also, in terms of how you'd want to store it, might be stored differently, because, you know, there are different approaches.
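
NOTE
The sparse 3D annotation case described above (points with properties such as a label and a pre- or post-synaptic kind) needs a different query shape than dense arrays. A minimal sketch follows, assuming hypothetical names (PointAnnotation, points_in_box); it is not the API of DVID or of any other system named in the call, and a real store holding 100 million points would need a spatial index rather than a linear scan.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class PointAnnotation:
    position: tuple  # (z, y, x) in voxel coordinates
    kind: str        # for example "pre" or "post"
    label: int       # id of the segment the point belongs to
def points_in_box(points, lower, upper):
    """Return annotations whose position lies in the half-open box [lower, upper)."""
    return [
        p for p in points
        if all(lo <= c < hi for c, lo, hi in zip(p.position, lower, upper))
    ]
synapses = [
    PointAnnotation((10, 20, 30), "pre", 7),
    PointAnnotation((11, 22, 31), "post", 9),
    PointAnnotation((500, 20, 30), "post", 7),
]
nearby = points_in_box(synapses, (0, 0, 0), (100, 100, 100))
```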

428
01:08:56.460 --> 01:09:05.190
Davis Bennett: The point I want to make might be a historical point. But I think what's unique about the types of data that we work with, thinking about microscopy data,

429
01:09:05.850 --> 01:09:14.220
Davis Bennett: Is that I don't think private industry has needed to solve the problem of sharing and interrogating enormous microscopy data sets.

430
01:09:14.940 --> 01:09:25.260
Davis Bennett: So, unlike a situation like a database for records or a database for transactions, where these things exist in industry and there are good technical solutions,

431
01:09:25.890 --> 01:09:35.250
Davis Bennett: I feel like, for the really large images that come off the microscopes we use, there's no precedent, I think. So we will have to come up with something

432
01:09:36.270 --> 01:09:42.960
Davis Bennett: On our own. Other things, like the point clouds, are maybe best solved with tools that already exist, that have been solved in other domains.

433
01:09:47.430 --> 01:09:58.470
Mark Kittisopikul: Oh, let's see. I was going to maybe try to clarify something about APIs: I think many of the APIs we work with are also bound to a specific programming language.

434
01:09:59.040 --> 01:10:11.160
Mark Kittisopikul: And something Davis and I have talked about in the past is maybe moving towards more of a service model, maybe something like a web service even, where you're more speaking a protocol.

435
01:10:12.510 --> 01:10:14.310
Mark Kittisopikul: Or Can think of something like a mirror.

436
01:10:15.330 --> 01:10:21.090
Mark Kittisopikul: So you have a server that's capable of actually parsing whatever format you want and providing you the

437
01:10:22.590 --> 01:10:23.370
Mark Kittisopikul: pixel data.

438
01:10:25.350 --> 01:10:33.840
Mark Kittisopikul: And so that's that's another thing. The other thing I've noticed and talking to some manufacturers is there's some willingness to support

439
01:10:35.310 --> 01:10:40.770
Mark Kittisopikul: Support the users if they could be provided some kind of interface.

440
01:10:42.000 --> 01:10:51.360
Mark Kittisopikul: So for example, we were talking to a camera manufacturer about maybe using a kind of block-based format. And they said, well, you know, just send us a DLL

441
01:10:52.410 --> 01:10:55.710
Mark Kittisopikul: And we, you know, we'd be willing to plug into that.

442
01:10:57.270 --> 01:11:00.360
Mark Kittisopikul: And so I think someone has to maybe think of

443
01:11:02.100 --> 01:11:09.300
Mark Kittisopikul: Some kind of modular ecosystem where, you know, we can define

444
01:11:10.320 --> 01:11:18.690
Mark Kittisopikul: Maybe abstractly, the interface, and there might be some willingness to build modules that connect to those interfaces.

445
01:11:21.240 --> 01:11:36.390
Mark Kittisopikul: And I guess one other point about APIs. So I think that, you know, there are some things that we need to be aware of in terms of the legal aspects of using APIs, which is currently working its way through the US Supreme Court.

446
01:11:37.710 --> 01:11:50.910
Mark Kittisopikul: Specifically, kind of, the Java, Google versus Oracle matter, that we need to be cautious about, making sure that when we do provide these interfaces, we make sure that they are free to use.

447
01:11:54.060 --> 01:11:54.420
Josh Moore: Damir?

448
01:11:59.880 --> 01:12:02.700
Damir Sudar: Sorry, I forgot to lower my hand. Sorry. Okay.

449
01:12:09.990 --> 01:12:11.130
Josh Moore: So wherever we ended up

450
01:12:16.710 --> 01:12:26.490
Josh Moore: Sounds like we certainly need so from my side. I mean, it probably means an investigation and maybe a learning experience of the API's that exists. So if anyone wants to, you know,

451
01:12:27.720 --> 01:12:40.620
Josh Moore: Volunteer to either teach me directly, or maybe set up a little lightning talk and walk us through it. That would actually be really valuable and would help me understand, are we actually seeing the same thing. And we're just seeing it in different ways, or

452
01:12:41.820 --> 01:12:55.200
Josh Moore: Are we in two completely different worlds or I think the most likely, and I think the most optimal. You know, is it, is it a subset, and a superset that we can actually work together quite nicely. That'd be my hope I'm

453
01:12:58.920 --> 01:12:59.760
Josh Moore: So that's fair.

454
01:13:02.640 --> 01:13:10.560
Josh Moore: We still have a couple more topics, and we have about 45 minutes, so if anyone wants to move us to something completely different, we can try to make

455
01:13:11.550 --> 01:13:25.980
Josh Moore: kind of wake up. I actually was warned in the previous session that we should have built in a break about halfway. So the next time we have a call, I'll build into the schedule at least a few minutes to get away and get something to drink.

456
01:13:29.910 --> 01:13:41.910
Josh Moore: I mean, next on the list, by most people interested, was actually just talking about the pyramids and the multiscale representation. Is that still valid? Does anyone have anything in particular they want to know or want to share about those?

457
01:13:44.340 --> 01:13:49.050
Davis Bennett: I just want to say I'm excited about the OME specification.

458
01:13:50.340 --> 01:14:04.770
Davis Bennett: For one, I am thinking about asking the Neuroglancer developer to support that, because I don't think he has implemented any pyramid specification for Zarr containers. Maybe Eric Perlman can speak to that.

459
01:14:07.110 --> 01:14:17.010
Eric Perlman: I can speak to that. Jeremy has support for Saalfeld's original, sort of hacky N5 scaled downsamples.

460
01:14:17.250 --> 01:14:27.360
Eric Perlman: Like, whatever it was that Saalfeld did initially with the FIB data, which John is very well aware of, he supports that. And so one of the things I've talked to Josh Moore about is that it would be straightforward to

461
01:14:28.050 --> 01:14:39.480
Eric Perlman: Add a dialect of that for the Zarr data source. I mean, basically, we'd just copy and paste the N5 one as a sample. The bigger issue with Neuroglancer is actually that he currently requires

462
01:14:41.040 --> 01:14:52.680
Eric Perlman: All the channel dimensions be in the same chunk. This is not an issue if you're doing EM data, but with RGB data, at the moment in the OME world, like with bioformats2raw,

463
01:14:53.430 --> 01:14:57.210
Eric Perlman: R, G, B and alpha are going to be four distinct chunks.

464
01:14:57.480 --> 01:15:07.410
Eric Perlman: And so when I have done this, I have to write to disk and then reformat it to the right size. But, I mean, it's one of those things where, unfortunately, Jeremy is the only one in the universe who could

465
01:15:07.740 --> 01:15:13.140
Eric Perlman: Fix Neuroglancer to support that as well. So anyway, that's probably something to ask him.

466
01:15:16.350 --> 01:15:26.160
John Bogovic: I'd also just like to say that I'm excited, and I hope we agree on what we do about pyramids in the future, but that's essentially all. I'm okay if we move ahead.

467
01:15:27.150 --> 01:15:40.920
Eric Perlman: Yeah, I mean, for the pyramids there have basically been two dialects. There was the model where for each resolution you say what the new resolution is, versus the other dialect, where for each resolution you say how much it was downsampled.

468
01:15:42.030 --> 01:15:45.900
Eric Perlman: I mean, they're functionally the same, but those have kind of been the two camps.
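
NOTE
The claim that the two pyramid dialects are functionally the same can be checked with a short conversion sketch. Assumptions: downsampling factors are given per axis relative to the full-resolution level (a dialect could equally express them relative to the previous level), and the function names and the example voxel size are hypothetical, not taken from any spec discussed here.
```python
def factors_to_resolutions(base_resolution, factors):
    """Dialect 'how much it was downsampled' to dialect 'what the new resolution is'."""
    return [[b * f for b, f in zip(base_resolution, level)] for level in factors]
def resolutions_to_factors(resolutions):
    """Inverse direction; the first entry is taken as the full-resolution level."""
    base = resolutions[0]
    return [[r / b for r, b in zip(level, base)] for level in resolutions]
base_resolution = [4.0, 4.0, 40.0]           # example voxel size per axis
factors = [[1, 1, 1], [2, 2, 1], [4, 4, 2]]  # per-level downsampling factors
resolutions = factors_to_resolutions(base_resolution, factors)
round_trip = resolutions_to_factors(resolutions)
```
Because each form can be derived from the other once the base resolution is known, the choice between them is a convention rather than a difference in information.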

469
01:15:45.960 --> 01:16:00.960
Josh Moore: You just kind of have to make a decision at some point. So for anyone who wasn't part of it, there was actually a quite protracted conversation on GitHub. I mean, if you follow the links from image.sc, you can get back there. So all of that's archived, right, for posterity. Um,

470
01:16:02.970 --> 01:16:09.840
Josh Moore: So from my side. Yeah, it was, it was good to just get everyone to agree, make a decision, say this is what we're going to do going forward.

471
01:16:10.410 --> 01:16:22.800
Josh Moore: Um, the more pieces of software that we can get reading and writing the same format, the better. So if anyone's blocked in any way on making that happen, speak up, and let's try to tackle that.

472
01:16:23.100 --> 01:16:30.780
Josh Moore: Sounds like we need time from Jeremy. So, okay, I don't know what what has to happen for that to occur. Maybe its funding. I don't know.

473
01:16:31.500 --> 01:16:37.110
Josh Moore: Um, from my side, I do know that, and someone was talking about that this morning,

474
01:16:37.590 --> 01:16:54.180
Josh Moore: You know, we punted on the scale factor, the scale origin and offset metadata from the multiscale specification. I assume that's blocking some people, and so, sorry, John. Um, so whoever would like to, you know,

475
01:16:55.830 --> 01:17:02.490
Josh Moore: I have a couple more things on my plate. But if someone wants to go, okay, you know, I have an opinion, this is how we should do it, basically

476
01:17:02.850 --> 01:17:10.380
Josh Moore: Bringing back that conversation from the Zarr spec repository, that would be a great way to make it happen. So basically we need

477
01:17:11.250 --> 01:17:19.020
Josh Moore: To run the conversation a little bit, get the consensus, again, I think the implementation is easy. You know, it's not really a problem. Once you've made the decision.

478
01:17:19.560 --> 01:17:28.950
Josh Moore: And then we go from there. So maybe we can get that done during 2020 that might be a nice way to kind of top off the year, um,

479
01:17:29.820 --> 01:17:30.510
Josh Moore: Nico, you're

480
01:17:30.660 --> 01:17:34.230
Josh Moore: Raising your oh actually several hands Nico then Trevor then Jamie, sorry.

481
01:17:36.120 --> 01:17:41.820
Nicolas Chiaruttini: Maybe I missed what you were talking about: the discussion detailing the specification, that happened somewhere on GitHub?

482
01:17:41.910 --> 01:17:42.420
Right.

483
01:17:44.160 --> 01:17:49.950
Nicolas Chiaruttini: Okay, so then, yeah, I just put a link, because then I would be interested to see

484
01:17:51.090 --> 01:18:00.510
Nicolas Chiaruttini: Regarding multi-resolution. So, I don't know, I just use the Bio-Formats API, which is working fine, and I did a bridge with BigDataViewer.

485
01:18:00.960 --> 01:18:11.820
Nicolas Chiaruttini: So, which is like an affine transform between different scales. What I was wondering, if you discussed this matter, is: when you do a reduction of several pixels towards one block,

486
01:18:12.960 --> 01:18:15.840
Nicolas Chiaruttini: Usually you compute the average of the pixels.

487
01:18:17.070 --> 01:18:26.280
Nicolas Chiaruttini: So then you get the average of eight pixels or whatever. But then I was wondering whether there was a possibility or it needs to be specified

488
01:18:26.670 --> 01:18:37.770
Nicolas Chiaruttini: Whether you want to do the reduction with the maximum value, for instance, or with a minimum value, or with something different. So, meaning specifying the process by which you're reducing

489
01:18:38.220 --> 01:18:44.430
Nicolas Chiaruttini: The scale of your object, because I found that it's an issue: if you perform only an average,

490
01:18:44.880 --> 01:19:02.070
Nicolas Chiaruttini: Then your structure, when you downscale a lot, basically reaches zero. But if you do a reduction with the maximum value, then you end up with a sort of maximum projection in 3D. So I don't know if I'm clear.

491
01:19:05.400 --> 01:19:07.200
Josh Moore: I think you're beyond me, but I think

492
01:19:08.310 --> 01:19:10.230
Josh Moore: We should try this, you know, is

493
01:19:10.440 --> 01:19:13.470
Josh Moore: There something that you want to see in the spec, like there's a way, you know,

494
01:19:13.650 --> 01:19:16.650
Josh Moore: Depending on what algorithms being used to do the down sampling

495
01:19:16.860 --> 01:19:18.780
Nicolas Chiaruttini: If you want to store what you did.

496
01:19:19.230 --> 01:19:24.900
Josh Moore: We just need an example, basically, of what you would like to capture, and then let's try to work it into the spec. So

497
01:19:26.070 --> 01:19:34.350
Eric Perlman: That would be a great thing in the spec. I mean, I can think of average, max, and then also there's the majority downsampling for label data masks.

498
01:19:35.580 --> 01:19:46.050
Eric Perlman: So yeah, I think basically that should be added to that conversation on GitHub, that that's something that should be in the spec. It's probably, again, one of those things which most people will not use. Most people will be doing

499
01:19:46.350 --> 01:19:49.440
Eric Perlman: Average downsampling, but it's good to have that metadata stored.
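
NOTE
The three reductions mentioned here (average, max, and majority for label data) can be compared on a small 2D example. This is a pure-Python sketch with hypothetical names (REDUCERS, downsample_2x); it is not the code of any converter discussed in the call, and it simply drops a trailing odd row or column.
```python
from collections import Counter
from statistics import mean
def majority(values):
    """Most frequent value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]
REDUCERS = {"mean": mean, "max": max, "majority": majority}
def downsample_2x(image, method):
    """Reduce each 2x2 block of a nested-list image to one value."""
    reduce_block = REDUCERS[method]
    out = []
    for y in range(0, len(image) - 1, 2):
        row = []
        for x in range(0, len(image[0]) - 1, 2):
            block = [image[y][x], image[y][x + 1], image[y + 1][x], image[y + 1][x + 1]]
            row.append(reduce_block(block))
        out.append(row)
    return out
# A single bright pixel is diluted by averaging but kept by max
sparse = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
# Label ids must not be averaged, so majority is the usual choice for masks
labels = [[1, 1, 2, 2], [1, 3, 2, 2], [4, 4, 5, 5], [4, 4, 5, 6]]
```
Averaging the sparse example turns the bright pixel into a dimmer value at the next level, which is the fading effect raised in the discussion, while max keeps it at full intensity.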

500
01:19:50.400 --> 01:20:03.600
Nicolas Chiaruttini: For 3D display, then, taking the maximum pixel out of a chunk can be useful for a 3D representation when downscaling it, but I can, yeah.

501
01:20:05.070 --> 01:20:11.670
Eric Perlman: I completely agree. My world lately has been a series of two-dimensional downsampling, so it's not as important, but I totally agree.

502
01:20:12.390 --> 01:20:14.010
Josh Moore: Okay, I think we're on Trevor. And then, Jamie.

503
01:20:16.290 --> 01:20:22.740
Trevor Manz (he/him/his): All right, and bring it the brilliant, but not quite on multi scale anymore, but I have them a little bit of

504
01:20:24.300 --> 01:20:31.440
Trevor Manz (he/him/his): There was a little bit of discussion about, you know, potentially thinking about storage more in terms of data APIs rather than formats.

505
01:20:32.040 --> 01:20:44.610
Trevor Manz (he/him/his): And I have done a little bit of experimenting with that, using sort of Zarr as a data API layer, rather than... like, basically using a small microservice

506
01:20:45.090 --> 01:21:03.720
Trevor Manz (he/him/his): To translate chunks from other formats to Zarr so that our web applications can do it. And I guess my comment is just that there are possibilities to have sort of a data API that mirrors some native format, but also, you know, have all these different types of binary containers as well.
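
NOTE
One piece of such a translation service is mapping a requested Zarr chunk key onto a pixel region of the source image. The sketch below assumes Zarr v2 style keys (chunk indices joined by a separator, "." by default); chunk_key_to_region is a hypothetical helper, not the code described in the call, and it clips edge chunks to the array bounds, whereas a real service would still have to pad edge chunks to the full chunk shape before encoding them.
```python
def chunk_key_to_region(key, chunk_shape, array_shape, separator="."):
    """Return (start, stop) per axis for the chunk addressed by key."""
    indices = [int(part) for part in key.split(separator)]
    if len(indices) != len(chunk_shape):
        raise ValueError("chunk key rank does not match chunk shape")
    start = [i * c for i, c in zip(indices, chunk_shape)]
    # Clip so the region never extends past the end of the array
    stop = [min(s + c, n) for s, c, n in zip(start, chunk_shape, array_shape)]
    return start, stop
region = chunk_key_to_region("1.2", (256, 256), (600, 600))
```
The service would then read that region through whatever reader understands the original format and return the bytes in the chunk encoding the client expects.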

507
01:21:04.950 --> 01:21:11.520
Trevor Manz (he/him/his): But I have done a little bit of experimenting with that. I just wanted to comment, if anyone wanted to follow up or has any questions.

508
01:21:13.050 --> 01:21:15.270
Josh Moore: Yeah, I have great hope in what Trevor's been doing, so

509
01:21:17.250 --> 01:21:20.550
Josh Moore: Very cool to make everything just turn into Zarr. Go for it, Jamie.

510
01:21:21.810 --> 01:21:25.920
Jamie Sherman: Okay, just two questions, or one question, one comment.

511
01:21:26.940 --> 01:21:30.000
Jamie Sherman: So one is with the different pyramid data.

512
01:21:31.590 --> 01:21:49.740
Jamie Sherman: Is the general practice, when you're pulling from a raw instrument manufacturer format, to just mirror over their other pyramids? Like, I work on one library that does this, and at the moment I ignore anything that's not pyramid zero. And so, like,

513
01:21:50.760 --> 01:21:55.950
Jamie Sherman: When you're mapping to a common format, is the general practice to just say, okay, I'm going to take

514
01:21:56.400 --> 01:22:04.830
Jamie Sherman: Pyramid zero and document how I transform it to a different pyramid level or is it to take the manufacturers image format.

515
01:22:05.670 --> 01:22:16.170
Jamie Sherman: With whatever other pyramid levels that they've encoded, and pull them across and put them in the data? Because I'm a little concerned in that case that you then end up with this, like,

516
01:22:16.830 --> 01:22:23.820
Jamie Sherman: Yeah, I got this out of something that uses a proprietary like where the information on how it was generated isn't necessarily shared

517
01:22:25.950 --> 01:22:29.310
Jamie Sherman: And so that's that's one kind of common question.

518
01:22:30.570 --> 01:22:37.320
Jamie Sherman: And then the other one is, and this is just because it's come up in this documenting

519
01:22:37.980 --> 01:22:56.940
Jamie Sherman: Downsampling and that kind of thing. I know that there are people at the institute I work at that are working on essentially ML transforms, where you take lower-resolution data and you try to up-transform the resolution using an ML algorithm or something like that.

520
01:22:58.050 --> 01:23:15.720
Jamie Sherman: There are pros and cons, et cetera, of this kind of methodology, but being able to, like, link to the method, or encode that this isn't actually the raw data, or that it's a transformed form of data, would be helpful later. It's something to be aware of and think of, at least.

521
01:23:19.530 --> 01:23:21.510
Josh Moore: Melissa, you want to take the bioformats2raw

522
01:23:21.540 --> 01:23:22.290
Jamie Sherman: Part of this

523
01:23:23.220 --> 01:23:33.000
Melissa Linkert: Sure. So, to your question about whether we, you know, transfer over any proprietary pyramid that was in the original file: we do not at the moment.

524
01:23:33.450 --> 01:23:44.160
Melissa Linkert: We only take the largest resolution, and then it's downsampled from there, for all of the reasons that you mentioned. You know, we can't really rely on

525
01:23:44.970 --> 01:23:58.560
Melissa Linkert: how that pyramid was generated. It may not be consistently downsampled across resolutions, so you may have different sizes between different pyramid levels. It may not be downsampled,

526
01:24:00.210 --> 01:24:11.670
Melissa Linkert: you know, at the scale that you would like to see in your final output file. There are just all sorts of reasons why it's unreliable to do that. So we just throw everything except the largest resolution away.

527
01:24:13.380 --> 01:24:20.550
Melissa Linkert: We do, in bioformats2raw, document or include in the metadata the downsampling algorithm that was chosen.

528
01:24:21.000 --> 01:24:36.360
Melissa Linkert: There are a few different options. For the most part we're using OpenCV to kind of do the heavy lifting for all of that, but whatever algorithm you choose, that's going to be recorded in the metadata that winds up in the Zarr or N5 container there.

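NOTE
Editor's sketch, not part of the recording. A minimal illustration of the approach Melissa describes: keep only the full-resolution level, regenerate the smaller levels from it, and write the chosen method into the metadata. This is not bioformats2raw's code or metadata layout; the function names, the 2x2 mean (used here instead of OpenCV), and the "downsampling" key are all assumptions made up for illustration.
```python
import json
# Hypothetical helper: halve a 2D image (a list of rows) by averaging 2x2 blocks.
def downsample_mean(image):
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]
# Build every level from the full-resolution image and record how it was done.
def build_pyramid(full_resolution, levels, method="mean"):
    if method != "mean":
        raise ValueError("only 'mean' is implemented in this sketch")
    pyramid = [full_resolution]
    for _ in range(levels - 1):
        pyramid.append(downsample_mean(pyramid[-1]))
    # Assumed metadata layout, invented for this example.
    metadata = {"levels": levels, "downsampling": {"method": method, "factor": 2}}
    return pyramid, metadata
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pyramid, metadata = build_pyramid(image, levels=3)
print(json.dumps(metadata))
```
Because every level is derived from level zero by one recorded method, a reader of the container does not have to guess how a vendor's embedded pyramid was produced.
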
529
01:24:38.280 --> 01:24:38.610
Melissa Linkert: Excellent.

530
01:24:38.700 --> 01:24:39.090
Melissa Linkert: Thank you.

531
01:24:40.650 --> 01:24:42.120
Josh Moore: And then, in general, I think, you know,

532
01:24:43.140 --> 01:24:48.240
Josh Moore: Several of you have touched on all kinds of things that could go into the metadata. Eventually, um,

533
01:24:49.830 --> 01:25:00.180
Josh Moore: We're currently trying to keep things as simple as possible. You know, so we're not going to try to specify up front, you know, how to store the ML model that you're using to do the downsampling or the upsampling.

534
01:25:00.870 --> 01:25:10.170
Josh Moore: Um, but I think those are interesting conversations. So from our side, we will try to get an infrastructure in place that allows the metadata to be written down

535
01:25:11.460 --> 01:25:24.660
Josh Moore: in these formats and accessed via these APIs, and then all hell's going to break loose, right? But it's going to be fun, and that's what we need to do. So we'll get there. Davis, I think you raised your hand.

536
01:25:27.960 --> 01:25:28.590
Josh Moore: Oh, you're muted.

537
01:25:29.460 --> 01:25:40.320
Davis Bennett: So I'd like to offer a contrarian perspective, maybe thinking pessimistically about putting the mechanism by which the data was generated in the multiresolution metadata.

538
01:25:40.890 --> 01:25:47.730
Davis Bennett: Like, it's not clear to me what the specific use case is for knowing that it was the mode or the mean. And then I worry about

539
01:25:48.600 --> 01:25:58.470
Davis Bennett: the possibility of arbitrary... I mean, you can imagine an arbitrary function that takes eight pixels and generates one pixel out of that. And so there will need

540
01:25:59.100 --> 01:26:07.170
Davis Bennett: Touches for this metadata field if somebody can't generate a string that someone else understand that they use to generate the down sampling

541
01:26:07.620 --> 01:26:17.130
Davis Bennett: So, philosophically, my perspective has been that a multiresolution pyramid literally just is a collection of images that should be able to stand on their own.

542
01:26:17.730 --> 01:26:24.330
Davis Bennett: And it's some application somewhere that thinks this thing forms a pyramid. But for me, the guiding principle was: in the metadata,

543
01:26:25.050 --> 01:26:36.630
Davis Bennett: metadata that describes that image, where it is in space, and nothing else, or arbitrary metadata that could apply to any image. So that's just me being contrarian, though. Yeah.

544
01:26:36.930 --> 01:26:47.760
Josh Moore: I think you're expressing what I'm trying to get at as well, which is I don't want to be the one responsible for it. I don't have a problem putting in a place where people can fill this out, you know, put something in there. But will the community know how to interpret it?

545
01:26:49.260 --> 01:26:50.340
Josh Moore: I guess they have to decide.

546
01:26:53.820 --> 01:26:57.930
Josh Moore: Okay, any other general themes or feelings on the multiscale representation?

547
01:27:00.030 --> 01:27:07.230
Josh Moore: Sounds like we've somewhat segued into metadata. There are a couple of people who have their names on it. Thanks, John.

548
01:27:08.430 --> 01:27:10.740
Josh Moore: Any immediate thoughts? Nico, you are first.

549
01:27:12.780 --> 01:27:14.850
Josh Moore: The spline worked

550
01:27:15.000 --> 01:27:19.950
Nicolas Chiaruttini: So, just because we have these very big, huge images,

551
01:27:21.240 --> 01:27:30.960
Nicolas Chiaruttini: we don't want to resave them when they are transformed. My application is, like, brain slice registration, and so I basically have a multiresolution image.

552
01:27:32.160 --> 01:27:51.780
Nicolas Chiaruttini: I'm doing an affine transform and a spline transform using John's BigWarp thing, and I just want to store the transformation. It can be, like, a few landmark points, so it's a very concise representation of a transform, and I don't know if it has to be supported or not.

553
01:27:53.250 --> 01:27:57.840
Nicolas Chiaruttini: I think the affine transform is very important to get the localization in 3D.

554
01:27:59.040 --> 01:28:07.080
Nicolas Chiaruttini: And I think that should also come with the time at which it was taken: time and space, that makes sense.

555
01:28:09.300 --> 01:28:19.290
Nicolas Chiaruttini: But then, on top of the transform, I was wondering whether we want to support some sort of standard spline deformation metadata.

556
01:28:25.500 --> 01:28:27.750
Josh Moore: Is your hand up, Davis, or is that up again?

557
01:28:27.870 --> 01:28:28.260
Damir Sudar: Go for

558
01:28:28.920 --> 01:28:35.520
Davis Bennett: Yeah, I would, once again... the space of possible image registration is so huge.

559
01:28:37.170 --> 01:28:43.380
Davis Bennett: Suppose you do support a metadata field for this specific type of image registration.

560
01:28:45.060 --> 01:28:58.830
Davis Bennett: Consider what fraction of images would use that field, and then how many other registration modalities there are. Like, can you store a warp field if you've done, like, a full, you know, a full nonlinear registration?

561
01:29:00.120 --> 01:29:09.660
Davis Bennett: I feel like image registration is like the medieval map where there are the ocean dragons. Like, the prospect of supporting anything more than linear methods

562
01:29:10.080 --> 01:29:18.600
Davis Bennett: that apply to the whole image makes me nervous. But maybe if there's something that 99% of images would use, and the pain of not having it in the metadata would be acute,

563
01:29:19.680 --> 01:29:22.680
Davis Bennett: I could be persuaded in the other direction. That's just my feeling.

564
01:29:26.130 --> 01:29:26.490
Josh Moore: John

565
01:29:29.310 --> 01:29:31.380
John Bogovic: unmute. Okay. Yeah, so I think

566
01:29:32.730 --> 01:29:40.830
John Bogovic: I appreciate that we shouldn't define exhaustively all of the metadata that will ever be used like

567
01:29:42.150 --> 01:29:53.610
John Bogovic: Josh or me or no one should be responsible for everything. But whatever format, whatever we decide, should be flexible enough that one can include their own metadata.

568
01:29:54.060 --> 01:30:07.830
John Bogovic: For their own specific needs, because if we don't people like Nico or me or somebody else are going to need or want something and they're going to leave for another format or write a new format with this metadata.

569
01:30:08.610 --> 01:30:14.280
John Bogovic: For their particular purpose. So not every tool has to understand how to use

570
01:30:15.120 --> 01:30:23.010
John Bogovic: The spline that Nico writes into there, but they will be able to at least get the data out whatever data are there, they will be able to get out.

571
01:30:23.760 --> 01:30:36.090
John Bogovic: That is the data that are common. So I'd just like to strongly advocate for being as flexible as possible for custom fields with custom content that not every tool

572
01:30:36.690 --> 01:30:49.710
John Bogovic: has to be able to read or write or understand. Whatever Nico or I or whoever writes into that should be documented by the user or creator of the tools that consume that.

573
01:30:51.240 --> 01:31:02.820
John Bogovic: But the fact is that if it's written in a common way, it will make it easier for other tools to potentially get it out. Like, it's much easier if, say,

574
01:31:03.540 --> 01:31:13.380
John Bogovic: someone in Python wants to use something that Nico writes: it will be easier for them to get at the data if they use the metadata of

575
01:31:15.060 --> 01:31:24.000
John Bogovic: formats for the way to store and read metadata, than it would be if Nico wrote his own format. And that's one of the main reasons I think we should be flexible.

576
01:31:25.140 --> 01:31:32.100
Ulrike Boehm: So, John, I completely agree with, like, having the flexibility. But one thing we should also

577
01:31:32.730 --> 01:31:41.580
Ulrike Boehm: all agree on is, like, really having proper documentation, so that in case people need something new, or they want to work with something else, everything's really

578
01:31:42.450 --> 01:31:55.080
Ulrike Boehm: listed in such a way that you can find your way around. And that, I think, is, I mean, independent of whether there's diversity or not, it should be so that people,

579
01:31:55.650 --> 01:32:05.490
Ulrike Boehm: I mean, that people don't become desperate or so. And it should also be clear what's written in the documentation, because sometimes people define that however, right? Like,

580
01:32:06.210 --> 01:32:11.850
Ulrike Boehm: but it should be so that people can really also make use of that documentation. But that's just my two cents.

581
01:32:13.860 --> 01:32:15.390
Bill Katz: Here. It kind of reminds me

582
01:32:16.230 --> 01:32:20.280
Damir Sudar: I think I'm next, sorry. So

583
01:32:21.330 --> 01:32:22.200
Damir Sudar: This is again a

584
01:32:24.810 --> 01:32:30.330
Damir Sudar: push for the parallel metadata storage discussion that

585
01:32:30.990 --> 01:32:42.690
Damir Sudar: Caterina is leading. In there, what we're trying to do is to have a basic set of metadata fields predefined, and actually manage those, and have them really

586
01:32:43.320 --> 01:32:55.800
Damir Sudar: solidly defined and documented as a base set, but then there is the possibility of extensions, right? So the idea: what Nico needs, maybe it's just an affine transform store.

587
01:32:56.310 --> 01:33:09.600
Damir Sudar: A lot of people need just an affine transform; that can be part of the base metadata definition. But then if you indeed need some weirdo spline transformation that isn't captured,

588
01:33:10.350 --> 01:33:20.700
Damir Sudar: or isn't easily captured in a standardized way, the extension mechanism within the metadata environment allows you to store your

589
01:33:21.270 --> 01:33:36.480
Damir Sudar: weirdo registration thing. So the idea is to have both, but to never get to the point where people say, man, it won't help me, and so I'm writing something completely different. So I'm completely on board with what John just said.

590
01:33:37.560 --> 01:33:38.580
Damir Sudar: So that's it. Thanks.

591
01:33:41.100 --> 01:33:47.280
Jamie Sherman: Yeah, okay, so, like, Davis's comment really made me think, and I

592
01:33:48.390 --> 01:33:58.950
Jamie Sherman: think he's got a really good point. And one of the things that might be key to it is, like, having essentially the equivalent of a binary flag that says this is actual raw data

593
01:33:59.550 --> 01:34:04.230
Jamie Sherman: and everything else is an interpolated data level.

594
01:34:04.860 --> 01:34:15.660
Jamie Sherman: That would be incredibly useful, because then if you're a naive user, you can just say, all right, I don't have time to evaluate what these other layers are or to derive how they got there.

595
01:34:16.200 --> 01:34:23.850
Jamie Sherman: But I just want to pull all this data in some model or something like that. And I want to know that it's real data without artifacts, then

596
01:34:24.390 --> 01:34:36.900
Jamie Sherman: you can find that easily, in a trivial way. And then obviously having the ability to add detail is great, but one of the things that does come to mind is, like,

597
01:34:37.440 --> 01:34:46.320
Jamie Sherman: How much of this do you want to be like, how much of an open format is cataloging what's been done to acquire the data. So it can be shared.

598
01:34:46.800 --> 01:35:00.150
Jamie Sherman: And how much of it is a lab notebook that's trying to capture everything that ever happened to the file, which kind of becomes a rabbit hole that can go on forever? That's just definitely a concern, anyway.

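NOTE
Editor's sketch, not part of the recording. One way the "binary flag" idea Jamie raises could look: each pyramid level carries a boolean saying whether it is acquired data, so a consumer can select raw levels without evaluating how the others were derived. The field names ("raw", "derived_by") are hypothetical and not part of any specification discussed in this session.
```python
# Hypothetical per-level metadata; only level "0" is acquired data.
levels = [
    {"path": "0", "shape": [4096, 4096], "raw": True},
    {"path": "1", "shape": [2048, 2048], "raw": False, "derived_by": "mean"},
    {"path": "2", "shape": [1024, 1024], "raw": False, "derived_by": "mean"},
]
def raw_levels(levels):
    # A missing flag is treated as "not raw", so undocumented levels are never trusted.
    return [level["path"] for level in levels if level.get("raw", False)]
print(raw_levels(levels))
```
Defaulting a missing flag to "not raw" is a design choice of this sketch: it errs on the side of excluding data whose provenance is unknown.
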
599
01:35:01.230 --> 01:35:07.770
Josh Moore: Just from my side, I would say, we would just put, you know, we'll just give the user paper and then from that point, they have to figure out what they're writing down so

600
01:35:09.600 --> 01:35:10.590
Josh Moore: It was back to Davis.

601
01:35:13.830 --> 01:35:16.170
Davis Bennett: Oh, I, I took my hand down, but I can

602
01:35:18.270 --> 01:35:19.410
Josh Moore: Bill is waiting as well.

603
01:35:19.650 --> 01:35:27.900
Davis Bennett: Sorry, I was just wondering, for these extensions, is there a standard way for

604
01:35:28.620 --> 01:35:38.040
Davis Bennett: a specific tool, like, say, Neuroglancer or BigWarp or whatever, to have its metadata registered somehow? What's the state of the art for that?

605
01:35:39.210 --> 01:35:46.470
Josh Moore: So I have an opinion here. I mean, and this is basically part of what will come up in Caterina's call on the 20th, um,

606
01:35:48.990 --> 01:35:51.090
Josh Moore: How far do I want to go in this so

607
01:35:52.470 --> 01:36:00.360
Josh Moore: The experience with OME-XML has been that XML, XSD, doesn't... or, sorry, yeah, XSD

608
01:36:00.990 --> 01:36:08.730
Josh Moore: doesn't provide what we need for the ability to extend. So basically Caterina went through this huge process, wrote a full extension of the OME-XML.

609
01:36:09.120 --> 01:36:15.990
Josh Moore: And the cost to everyone would have been so high, to actually support it. Right. I mean, it's basically this huge breaking change.

610
01:36:16.770 --> 01:36:30.810
Josh Moore: So my feeling is that JSON-LD is the path forward, and I have some work to do to make that proposal and show it to everyone. So that's on me and hopefully the person we hired to do this, um,

611
01:36:32.850 --> 01:36:37.830
Josh Moore: So the major competitor to that would be JSON Schema. I'm pretty

612
01:36:39.030 --> 01:36:46.560
Josh Moore: skeptical of JSON Schema at this point, after having spent over a year on the Human Cell Atlas project trying to use JSON Schema for this.

613
01:36:46.800 --> 01:36:54.810
Josh Moore: Basically you have all the downsides of XML Schema and none of the support that was built up by the W3C over, you know, decades.

614
01:36:55.800 --> 01:37:11.100
Josh Moore: The RDF or JSON-LD community is W3C-standardized and does have a lot of support. I don't think the support's great. The other major downside is it's possible to write things that are very difficult

615
01:37:12.030 --> 01:37:19.260
Josh Moore: to understand. So my ideal would be if we could find a way to have some subset of JSON-LD.

616
01:37:19.620 --> 01:37:28.860
Josh Moore: I mean, we already want to put JSON into Zarr, right, so that's kind of pretty straightforward, and then say this is the subset of JSON-LD that we're going to support.

617
01:37:29.580 --> 01:37:36.270
Josh Moore: One of the key values of it is its extensibility. It's an open-world model: you can always talk about anything. So basically,

618
01:37:36.690 --> 01:37:47.280
Josh Moore: from the point of view of OME currently, as you know, we are the gatekeepers. We keep people out of the specification

619
01:37:47.760 --> 01:37:51.720
Josh Moore: By force, you know, we just, we can't let anyone write whatever they want.

620
01:37:52.560 --> 01:37:59.790
Josh Moore: The open-world model is exactly the opposite. Anyone can say anything they want. So I think that gets us exactly the extensibility that we need.

621
01:38:00.540 --> 01:38:15.150
Josh Moore: Um, will it end in ultimate chaos? I hope not. That's certainly my worry. Bill, let's let Caterina speak, since I talked about her, and then get back to you. And then we slowly need to start wrapping up.

622
01:38:17.610 --> 01:38:30.480
Caterina Strambio: Yeah. So, I mean, yeah, there is going to be a follow-up on the discussion on the 20th. Well, I think it's going to be on the 20th of November, but we still have to finalize. Anyway, you can look it up on image.sc.

623
01:38:31.890 --> 01:38:49.440
Caterina Strambio: But, um, yeah. So, I mean, I didn't jump in because the metadata problem is, like, there are multiple things that are defined as metadata. And the one that we have worked on, initially for the 4D Nucleome and then BINA, and now in support of QUAREP,

624
01:38:50.580 --> 01:39:00.360
Caterina Strambio: with Damir and Ulrike and several other people, is the microscopy part: so hardware, settings, quality control.

625
01:39:01.980 --> 01:39:08.610
Caterina Strambio: And the reason why I mean there was a need for the community to

626
01:39:09.870 --> 01:39:20.820
Caterina Strambio: chime in on what is currently being represented in the 2016 version. And we used the XSD because that was, at that time,

627
01:39:21.870 --> 01:39:33.420
Caterina Strambio: how the model was represented. But obviously, all along, we've been very interested in moving past that. So,

628
01:39:35.610 --> 01:39:43.650
Caterina Strambio: First, I mean, it is also one of the topics that we will discuss: you know, is it necessary to have multiple representations? And how do we make sure that they keep,

629
01:39:45.870 --> 01:39:48.540
Caterina Strambio: they keep, you know, aligned with each other?

630
01:39:50.730 --> 01:39:57.660
Caterina Strambio: Anyway, so, um, I don't want to take too much time in in this space. I mean, we can talk about it more

631
01:39:58.770 --> 01:39:59.130
Later.

632
01:40:00.660 --> 01:40:01.890
Caterina Strambio: But anyway, um,

633
01:40:04.020 --> 01:40:15.930
Caterina Strambio: I think that the representation of the metadata is a little bit of a convergence between a couple of different worlds. One is, what do we want to represent? And so, for example, the microscopy community

634
01:40:16.290 --> 01:40:26.520
Caterina Strambio: has to chime in on what needs to be represented for defining the hardware and the settings and the quality control, versus how do we represent it. And so, you know,

635
01:40:27.480 --> 01:40:44.850
Caterina Strambio: having different representations has been necessary for our work, because not everybody can look into an XSD file or even into a JSON-LD file. So we need a way for people to interact with the content and

636
01:40:46.290 --> 01:40:46.500
Yeah.

637
01:40:48.030 --> 01:40:48.300
Josh Moore: Okay.

638
01:40:48.330 --> 01:40:52.170
Caterina Strambio: Oh, absolutely, that that has to be human readable.

639
01:40:52.860 --> 01:40:58.500
Josh Moore: So I'll add the link to Caterina's image.sc post. And then, Bill, what did you want to say?

640
01:40:59.760 --> 01:41:11.880
Bill Katz: I just had a question, mostly because it was prompted by what Nicolas was saying, where, you know, there was this basically custom transformation, and

641
01:41:12.480 --> 01:41:24.990
Bill Katz: Sort of tying it into compression as well, that if you have an extension or some sort of customized way of, you know, that, that the data actually has to be run through something else.

642
01:41:25.980 --> 01:41:33.630
Bill Katz: I assumed that because I'm not that familiar with all the different types of metadata. Is there something similar to like what we would do with

643
01:41:34.170 --> 01:41:55.350
Bill Katz: source code packaging, where there's, like, a URI or something that corresponds to a particular, you know, like a unique identifier for an extension code base, that would have to be sort of, you know, a plugin that has to be accessible so that someone could actually get the data?

644
01:41:55.890 --> 01:42:07.680
Josh Moore: So at the compression level, yes. Currently it's fairly hard-coded in the v2 spec of Zarr. The next version of Zarr that everyone's working on

645
01:42:08.520 --> 01:42:19.410
Josh Moore: does have a more formal, I think PURL-based, extension registry. So, you know, you get a PURL URL that's the source of your,

646
01:42:20.100 --> 01:42:31.380
Josh Moore: Probably shared library or in Python, you know, just code that you can load and that does the the filtering, or the compression or you know whatever process, it is you want to apply to every chunk.

647
01:42:32.370 --> 01:42:44.250
Josh Moore: At the metadata level, at least in the JSON-LD world, you have the concept of a context. So you should be able to go to a URL online and load a definition of the metadata.

648
01:42:45.480 --> 01:42:53.160
Josh Moore: Whether or not you can interpret that representation basically depends on how smart your client is, you know, how much do you understand the vocabulary.

649
01:42:53.790 --> 01:42:59.340
Josh Moore: Um, so, you know, you could imagine, if we stick to just the transforms,

650
01:42:59.970 --> 01:43:11.040
Josh Moore: you know, I think, Damir, you talked about a base transform class and an affine transform subclass. And if there's a spline subclass that lives in a different context,

651
01:43:11.790 --> 01:43:19.590
Josh Moore: All the fields that are common amongst those you could interpret and maybe do something with them. But you'll need to have the concept of what's a required

652
01:43:20.400 --> 01:43:35.910
Josh Moore: You know, is this useful to me if I don't understand all the metadata? It gets very tricky very quickly, as James is pointing out. So I don't know if we're going to be able to build clients which can work with partial information. Not sure.

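NOTE
Editor's sketch, not part of the recording. A rough illustration of the "partial information" problem Josh describes: a client resolves each transform's type against the document's context, interprets the types it knows, and passes the rest through untouched. All vocabulary URLs and field names are invented, and the prefix expansion below is a hand-rolled simplification, not a conforming JSON-LD processor.
```python
import json
# Hypothetical document: the affine type lives in a base vocabulary, the spline in an extension.
document = json.loads("""
{
  "@context": {
    "base": "https://example.org/transforms/base#",
    "ext": "https://example.org/transforms/spline#"
  },
  "transforms": [
    {"@type": "base:Affine", "matrix": [[2, 0, 10], [0, 2, 20]]},
    {"@type": "ext:Spline", "landmarks": [[0, 0], [5, 7]]}
  ]
}
""")
# The only vocabulary term this imaginary client knows how to interpret.
KNOWN_TYPES = {"https://example.org/transforms/base#Affine"}
def expand(term, context):
    # Turn "prefix:Name" into a full identifier when the prefix is declared.
    prefix, sep, name = term.partition(":")
    return context[prefix] + name if sep and prefix in context else term
def split_by_understanding(doc):
    understood, opaque = [], []
    for transform in doc["transforms"]:
        iri = expand(transform["@type"], doc["@context"])
        (understood if iri in KNOWN_TYPES else opaque).append(transform)
    return understood, opaque
understood, opaque = split_by_understanding(document)
print(len(understood), "understood;", len(opaque), "kept as opaque data")
```
Whether ignoring the opaque entries is safe is exactly the open question raised above: a spline the client cannot apply may still be required to place the image correctly.
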
653
01:43:37.170 --> 01:43:47.400
Bill Katz: Is there any notion of sort of an extension perhaps being a service as opposed to a link to a piece of code that gets downloaded needs to be executed.

654
01:43:48.900 --> 01:43:49.110
Josh Moore: Only

655
01:43:50.790 --> 01:43:55.350
Bill Katz: There are lots of, like, lambdas and various other things now which are available.

656
01:43:56.670 --> 01:44:07.080
Josh Moore: I don't know of anyone who's doing that, but that doesn't mean it's not possible. My, my knee jerk reaction is I have security concerns that sounds actually quite scary.

657
01:44:07.530 --> 01:44:18.600
Josh Moore: Um, but yeah, I mean, the security issues can be handled with enough complexity, with enough, you know, infrastructure. So that's something to think about.

658
01:44:21.000 --> 01:44:23.160
Josh Moore: Did anyone else have anything to kind of close up.

659
01:44:26.370 --> 01:44:27.450
Josh Moore: Okay, I'm gonna write down

660
01:44:29.550 --> 01:44:30.180
Like I said,

661
01:44:31.860 --> 01:44:34.080
Josh Moore: No URL as a service.

662
01:44:38.940 --> 01:44:40.140
Josh Moore: Okay, so

663
01:44:41.640 --> 01:44:43.320
Josh Moore: If we're kind of in finishing up mode.

664
01:44:45.660 --> 01:44:51.420
Josh Moore: There's certainly a sense that we should figure out next steps. I don't have them off.

665
01:44:52.320 --> 01:45:01.020
Josh Moore: the tip of my tongue. If anyone has anything that they're clear they either want to do, or they want to meet up with someone and have a conversation, feel free to say it now or write it into the document.

666
01:45:01.770 --> 01:45:10.200
Josh Moore: Otherwise, I will certainly write up a summary of some form and post it to image.sc or to the OME website. Um, and we keep the conversation going.

667
01:45:11.280 --> 01:45:13.440
Josh Moore: So does anyone have anything they want to do right away.

668
01:45:16.110 --> 01:45:16.740
Josh Moore: Thank you.

669
01:45:18.210 --> 01:45:31.530
Ola Tarkowska: So I have a proposal. It depends on how extensive your notes are, but today there were plenty of problems raised and quite a lot of,

670
01:45:32.280 --> 01:45:42.510
Ola Tarkowska: basically, significant knowledge was shared, and I feel there is quite a lot behind this as well. So I was wondering if people would be willing to write some,

671
01:45:43.020 --> 01:45:55.440
Ola Tarkowska: contribute to this specification? For example, gathering requirements like the MoSCoW method does: so provide what must, what should,

672
01:45:55.950 --> 01:46:08.310
Ola Tarkowska: what could be there, or what you don't want to see, based on your experience: modalities, the various instruments you are using, various analysis techniques, and

673
01:46:09.810 --> 01:46:14.910
Ola Tarkowska: Is this something people would be interested to write down to contribute to

674
01:46:24.570 --> 01:46:27.660
Josh Moore: From my side, certainly, going back to, I think, where we started,

675
01:46:30.360 --> 01:46:32.700
Josh Moore: I don't have. I don't have the URL for you.

676
01:46:34.410 --> 01:46:43.710
Josh Moore: Welcome to take suggestions. But from our side. I think the output of everything we discussed needs to show up as issues. And I feel like with the two sessions we had

677
01:46:44.700 --> 01:46:56.250
Josh Moore: there's at least a good dozen highlights we need to think about. And then, Ola, I guess what you're proposing is a form of not voting on those but categorizing them in terms of

678
01:46:56.940 --> 01:46:58.410
Josh Moore: Priorities yeah

679
01:46:58.860 --> 01:47:16.650
Ola Tarkowska: Yeah, this could help you to decide what the API actually should have and must have, and what shouldn't be there, based on the use cases and experience. Because this was raised at the beginning, that we should operate on workflows.

680
01:47:19.530 --> 01:47:24.300
Ola Tarkowska: Well, it's up to everyone. So it's just a suggestion. This could highlight

681
01:47:26.760 --> 01:47:28.680
Ola Tarkowska: Lots of good things.

682
01:47:32.760 --> 01:47:35.400
Josh Moore: Yeah, so I think something like that definitely needs to happen. I'm a bit

683
01:47:36.750 --> 01:47:43.710
Josh Moore: unclear what's the path forward to make it happen. I mean, we're all busy so that'll be the biggest trick I'm

684
01:47:45.570 --> 01:47:52.980
Josh Moore: Okay. So, kind of in that vein, so now I'm just going through the we have nine minutes. So the next step topics that I listed

685
01:47:55.230 --> 01:47:56.400
Josh Moore: I assume everyone's okay with

686
01:47:56.430 --> 01:47:59.160
Josh Moore: publishing the notes. They've been public

687
01:48:00.030 --> 01:48:02.460
Josh Moore: publishing the recordings. I'll get all that online.

688
01:48:04.740 --> 01:48:06.570
Josh Moore: Lots of thumbs up or thumbs down.

689
01:48:08.520 --> 01:48:12.030
Josh Moore: Is there a general feeling that it's worth doing this regularly.

690
01:48:13.170 --> 01:48:16.230
Josh Moore: So I asked this morning as well. So in roughly a month.

691
01:48:17.430 --> 01:48:22.260
Josh Moore: So I hope, from the OME side, that we'll have some more results; we'll do another video or two.

692
01:48:23.970 --> 01:48:32.940
Josh Moore: I mean, I can just throw it out there on image.sc and whoever shows up shows up. So that's fine. If there are particular discussions that someone wants to happen

693
01:48:33.900 --> 01:48:43.410
Josh Moore: in that month's period, somehow say the word, either now or in the document or on image.sc when, you know, I open up the registration for the next one.

694
01:48:44.820 --> 01:48:48.510
Josh Moore: Anyone's welcome to take control of this, you know, do what you need to do show something

695
01:48:50.520 --> 01:48:55.230
Josh Moore: Whatever it takes to really keep us moving forward. So basically moving forward like Ola is talking about.

696
01:48:56.400 --> 01:48:59.460
Josh Moore: That was the topics videos, um,

697
01:49:05.790 --> 01:49:14.820
Josh Moore: So from my side. I'll just say I I've experienced more and more that recording things helps with the communication, especially with all the time zones, we're dealing with

698
01:49:15.270 --> 01:49:19.170
Josh Moore: So that keeps me from needing to do the same presentation in the morning in the evening.

699
01:49:19.800 --> 01:49:29.970
Josh Moore: And I know there's a napari meeting which takes place for the Pacific and California time zones. Um, I'll probably never make it to that. But, you know,

700
01:49:30.600 --> 01:49:36.570
Josh Moore: I'm more than happy to send a video somewhere, or you can send a video from some other time zone to this time zone. You know, it's kind of like

701
01:49:38.850 --> 01:49:48.090
Josh Moore: A TARDIS of some form of sharing information. Basically, we need new ways of working. So if anyone has ideas you know we're open to

702
01:49:48.900 --> 01:50:02.790
Josh Moore: Having everyone be involved, but having lots of meetings, obviously, is draining and time consuming. So what's the fine balance and what do you all want to see happen if you say everything should happen on GitHub. Then, you know, so be it. That's fine.

703
01:50:03.900 --> 01:50:07.920
Josh Moore: Last point for me. Um, is there any feedback on how

704
01:50:09.090 --> 01:50:16.410
Josh Moore: the setup of this meeting took place on image.sc: did it work for everyone? Were the notifications annoying? Were they visible?

705
01:50:18.030 --> 01:50:19.170
Josh Moore: Would you change anything.

706
01:50:23.790 --> 01:50:29.850
Josh Moore: So the only so we did get the feedback this morning that having everyone say. Me too. Me too. Me too, me too.

707
01:50:30.630 --> 01:50:41.550
Josh Moore: was kind of annoying. So I'll try to come up with something that prevents that for next time. The top-level suggestion, and here's where I need a yes or no, or actually, I'll just say I'm going to do this, but you're welcome to veto it,

708
01:50:42.600 --> 01:50:53.250
Josh Moore: is to set up a group on image.sc called NGFF, probably add all of you to it. And that way I don't have to add all of you to every thread, I can just add the group.

709
01:50:54.090 --> 01:51:04.620
Josh Moore: If you don't want to be in the group, you're welcome to leave the group yourself, or you could tell me in some form that I shouldn't do it. But it's gonna make my life simpler, so I'm looking forward to that.

710
01:51:06.930 --> 01:51:07.560
Josh Moore: Cool.

711
01:51:08.580 --> 01:51:13.770
Josh Moore: Five minutes left, the floor is open. I can stick around for a bit. It's good to see everyone.

712
01:51:18.750 --> 01:51:21.960
Josh Moore: Stay safe, healthy vote. I don't know.

713
01:51:24.120 --> 01:51:36.360
Jason Swedlow: Just, I'll just, I'll just follow up. I mean, I put some comments in the chat. Clearly we're going forward with this, we have, we have funding to do this, and you know, if, if solutions exist, tell us.

714
01:51:38.580 --> 01:51:46.080
Jason Swedlow: If you have specific requirements, as we were discussing a few minutes ago, around example metadata or whatever, say that as well. I mean,

715
01:51:48.150 --> 01:51:57.360
Jason Swedlow: It sounds like, I mean, the convergent solution is GitHub issues, but also these forums or image.sc, you know, pick your flavor.

716
01:52:01.650 --> 01:52:02.040
Or

717
01:52:03.330 --> 01:52:07.560
Jason Swedlow: And it's lovely to see you guys and see you all see everybody reasonably healthy so

718
01:52:09.960 --> 01:52:13.110
Josh Moore: If someone adds a topic of a beer session for the next one, then we can

719
01:52:14.790 --> 01:52:15.600
Josh Moore: We can make that happen.

720
01:52:16.200 --> 01:52:17.550
Jason Swedlow: So it will become very popular.

721
01:52:19.140 --> 01:52:22.890
Josh Moore: It's an official work event, right? I mean, you have to be allowed.

722
01:52:25.080 --> 01:52:31.620
Josh Moore: Cool. Then, I mean, anyone who wants to disappear, feel free. I'm gonna hang out for a minute, but take care, everyone.

723
01:52:33.150 --> 01:52:33.780
Mark Kittisopikul: Thank you.

724
01:52:34.830 --> 01:52:35.250
Ola Tarkowska: Hi.

725
01:52:37.800 --> 01:52:41.370
Eric Perlman: Josh, I don't know how you managed to host two of these in one day.

726
01:52:44.700 --> 01:52:46.560
Eric Perlman: You have some superhuman skills here.

727
01:52:47.490 --> 01:52:48.210
Josh Moore: I am

728
01:52:49.740 --> 01:53:01.680
Josh Moore: I don't think I really, I didn't prepare anything. So it's just kind of like, you know, I was kind of nervous it was going to be absolute chaos, but everyone behaved, so it's, it's all fine, right? It could have been horrible, but it was, it was lovely.

729
01:53:04.830 --> 01:53:09.720
Josh Moore: It's good talking to people who haven't been to a conference since February, good grief.

730
01:53:12.420 --> 01:53:13.380
Josh Moore: A real conference.

731
01:53:14.400 --> 01:53:26.010
Ulrike Boehm: But also a comment to Jason, like, thank you for putting together this paper about formats and repositories to kick this off, because also this is super crucial, because I think we need also their

732
01:53:26.760 --> 01:53:35.820
Ulrike Boehm: guidelines and standards. I don't know, I, I don't like to hear this word anymore, I have to say. I've used it so often already. But also with the repository part,

733
01:53:36.210 --> 01:53:47.820
Ulrike Boehm: I really do have to say that although there are lots of repositories out there for image data, I really have to say they are still also trying to find themselves. That sounds a little bit esoteric right now, but

734
01:53:48.810 --> 01:53:54.480
Ulrike Boehm: We had a talk at Janelia from someone who's also in charge of, I think, the image database.

735
01:53:55.320 --> 01:54:03.030
Ulrike Boehm: And she was only saying they still don't know how much data they actually want to put in these repositories, and so on and so forth. So

736
01:54:03.420 --> 01:54:11.160
Ulrike Boehm: If it's kind of only the data of the paper, or everything. So this is also something where, where the entire community actually has to be

737
01:54:11.640 --> 01:54:18.630
Ulrike Boehm: Involved, because I think decisions are being made. But I think when decisions are made, I think the people who have the data, who produce the data.

738
01:54:19.230 --> 01:54:31.440
Ulrike Boehm: Should be part of these discussions. And I mean, which is kind of super natural, right, but I think sometimes people are overambitious to get things done, and then they sometimes might miss important

739
01:54:31.440 --> 01:54:33.510
Ulrike Boehm: Important. And just as a

740
01:54:36.810 --> 01:54:37.140
Jason Swedlow: Yes.

741
01:54:38.160 --> 01:54:42.420
Jason Swedlow: Yes. Oh, all, all correct and

742
01:54:45.210 --> 01:54:45.570
Ulrike Boehm: Yeah.

743
01:54:45.990 --> 01:54:59.820
Jason Swedlow: Yeah, that, that whole side of things, the data repository, data resource side of things, is probably, it's just, probably the shortest thing to say is it's early days, and we're going to stumble for a while and

744
01:55:01.260 --> 01:55:12.000
Jason Swedlow: I had, right before lockdown, I had the pleasure of having a drink with Helen Berman, who's been doing PDB for 50 years and, you know, she said, "Yeah, you know,

745
01:55:12.210 --> 01:55:13.740
Josh Moore: Should I, should I turn off the recording.

746
01:55:14.640 --> 01:55:14.970
Probably

747
01:55:16.410 --> 01:55:17.220
Josh Moore: Turning off the recording.

748
01:55:20.820 --> 01:55:22.170
Jason Swedlow: Or live was yeah you know