Things I learned today about C#
andrewducker
The code I've been writing to convert a list into a comma-delimited string, which consists of:
var duckerBrothers = new List<string> { "Andy", "Mike", "Hugh" };
var output = string.Empty;
foreach (var duckerBrother in duckerBrothers)
{
if (output != string.Empty)
{
output += ", ";
}

output += duckerBrother;
}

Can be rewritten as:
var duckerBrothers = new List<string> { "Andy", "Mike", "Hugh" };
var output = string.Join(", ", duckerBrothers);

That's what I get for assuming that there is no built-in way of doing things.



Original post on Dreamwidth.

Have you been working too long with languages that expect you to rebuild the wheel from grains of sand?

I'll say this for PHP, it has lots of helpful things. (Just they're all inconsistently named and parametrized...)

Yeah, I've read that. I agree with a lot of it (the bits I understand at least...). However, I'm coding for the web and PHP it is -- I just every now and then write a comment:

// WTF, PHP?

;)

Naah, just working with a language that has massive libraries, and not being familiar with all of them. After writing the code I thought "There must be a better way!" and Stack Overflow came to my rescue :->

Lol. I have dumped whole swathes of library code with each new version. We can't always pick up on everything. I presume you have also found Split?

Edited at 2013-01-15 03:35 pm (UTC)

That one I had, because it was an obvious thing to build, so I assumed it was there!

This is why I'm not too good with any language that has a very big library (such as C# and Java): making Join a method of String makes absolutely no sense to me. It is the List you're joining together (with a String as delimiter), not a String and a List.
When I become world dictator, I'll have it re-written. Just you wait!

string.Join only takes a list of strings. If the method was on List then it would have to support lists of any object at all, and as the result is a string, this wouldn't be terribly useful.

In fact, earlier versions of string.Join only took arrays of strings - allowing any IEnumerable<string> is a modern addition.

Doesn't every class have a ToString-method?

I think everybody is better off with me not being an API- or library-designer. :P

Every class does - but it seems weird to expect the code that converts a bunch of strings into a single string, using a user-defined delimiter, to be in "List". Especially as string.Join doesn't take a list - it takes an IEnumerable, and therefore works on arrays, linked lists, hashmaps, dictionaries, and any user-defined enumerable structure you care to create.
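To illustrate the point about IEnumerable, here is a minimal sketch, assuming .NET 4 or later (where string.Join accepts any IEnumerable<T>). CommaJoin is a hypothetical helper name, not something from the framework.

```csharp
using System;
using System.Collections.Generic;

public static class JoinSources
{
    // Hypothetical helper: accepts any sequence, not just List<string>.
    // For non-string elements, string.Join calls ToString() on each one.
    public static string CommaJoin<T>(IEnumerable<T> items)
    {
        return string.Join(", ", items);
    }

    public static void Main()
    {
        string[] array = { "Andy", "Mike", "Hugh" };
        var linked = new LinkedList<string>(array);
        var numbers = new List<int> { 1, 2, 3 };

        Console.WriteLine(CommaJoin(array));   // Andy, Mike, Hugh
        Console.WriteLine(CommaJoin(linked));  // Andy, Mike, Hugh
        Console.WriteLine(CommaJoin(numbers)); // 1, 2, 3
    }
}
```

The same call works on all three because the only thing string.Join needs is something it can enumerate.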

In C# 4 you could use an extension method on IEnumerable to do that - but the code long-predates that.

See my comment below. You want reduce, you do.

See my comment above: I really don't!

C# 3.0 (with .NET 3.5) onwards has map, filter and reduce, but not by those names. The names are SQL-ish, i.e. "Select", "Where" and "Aggregate".
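A small sketch of how those three names line up, assuming System.Linq is available. SumOfEvenSquares is a made-up example method, chosen only so that all three operators appear in one pipeline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqNames
{
    // map -> Select, filter -> Where, reduce -> Aggregate
    public static int SumOfEvenSquares(IEnumerable<int> numbers)
    {
        return numbers
            .Select(n => n * n)                        // map: square each number
            .Where(sq => sq % 2 == 0)                  // filter: keep the even squares
            .Aggregate(0, (total, sq) => total + sq);  // reduce: add them up, starting from 0
    }

    public static void Main()
    {
        Console.WriteLine(SumOfEvenSquares(new[] { 1, 2, 3, 4 })); // 20
    }
}
```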

It makes no sense because it's poor design.

Better to adopt a Scheme-like approach and use a higher order function (reduce) to apply a function that concatenates a pair of strings (with a delimiter) to a list of strings.

I can do that too (see below). But frankly 99% of the time writing:
var result = duckerBrothers.Aggregate((collected, next) => collected + ", " + next);
is more complex and harder to understand than:
var result = string.Join(", ", duckerBrothers);

The latter is instantly clear as to what it does, and probably a lot faster than the former.
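One further difference between the two shows up on an empty list. The sketch below assumes the seedless Aggregate overload used above; the method and class names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptyListCheck
{
    public static string JoinWithAggregate(IEnumerable<string> items)
    {
        // Aggregate without a seed throws InvalidOperationException on an empty sequence.
        return items.Aggregate((collected, next) => collected + ", " + next);
    }

    public static string JoinWithStringJoin(IEnumerable<string> items)
    {
        // string.Join returns an empty string for an empty sequence.
        return string.Join(", ", items);
    }

    public static void Main()
    {
        var nobody = new List<string>();
        Console.WriteLine(JoinWithStringJoin(nobody).Length); // 0
        try
        {
            JoinWithAggregate(nobody);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Aggregate threw on the empty list");
        }
    }
}
```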

Yeah, that's the thing about C# code, you end up using a fair bit of functional small stuff *inside* object methods.
I'd wrap it up like

  public static string MyJoinAsStrings<T>(this IEnumerable<T> items)
  {
    return items
       .Select(x => x.ToString())
       .Aggregate((collected, next) => collected + ", " + next);
   }


But of course with the caveats that
1) There is something built in for the usual case where T is already string, and

2) joining strings with "+" is not suitable for long lists.



Edited at 2013-01-16 09:33 am (UTC)

The caveats are easily solved by:
        public static string MyJoinAsString(this IEnumerable items)
        {
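            // Non-generic IEnumerable has no Select; Cast<object>() (System.Linq) bridges it to LINQ.
            return string.Join(", ", items.Cast<object>().Select(item => item.ToString()));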
        }


And then you don't have to worry about long lists, because Join is optimised for that. Oh, and String.ToString() is just a "return this", which has negligible overhead, so I'm not worried enough about that to write a specific override.

Fair enough; dropping down to the non-generic base class is a trick that I am learning ... slowly. I did something similar with an IDictionary<T, U> recently; just treated it as an IDictionary.

For the specific case of joining strings there's the library function here, but in general, the construct you had originally seems common enough that it should get some language support. I often find myself wanting to do "loop through all these things with some processing in between elements", and it seems you always have to do some clumsy construction (like your check against string.Empty).

Are there languages that have some nice syntax for that?
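C# has no dedicated syntax for this, but the pattern can be wrapped up once in a helper. This is only a sketch: ForEachWithBetween is a hypothetical name, and it uses a "first" flag rather than any built-in facility.

```csharp
using System;
using System.Collections.Generic;

public static class BetweenElements
{
    // Hypothetical helper: runs onItem for every element, and runs between
    // once for every gap between two neighbouring elements.
    public static void ForEachWithBetween<T>(IEnumerable<T> items, Action<T> onItem, Action between)
    {
        var first = true;
        foreach (var item in items)
        {
            if (!first)
            {
                between();
            }

            onItem(item);
            first = false;
        }
    }

    public static void Main()
    {
        ForEachWithBetween(
            new[] { "Andy", "Mike", "Hugh" },
            name => Console.Write(name),
            () => Console.Write(", "));
        Console.WriteLine();
    }
}
```

Tracking a flag instead of comparing against string.Empty means the "in between" step does not depend on what the elements contain.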

C# has very few language constructs - as much as possible is in a library.

I'm sure there are languages with lovely constructs for this kind of thing. People seem to like Python a lot.

Python actually has something very similar: 'join' as an instance method of a string.

As to the general case, the pattern known as 'fold' (or 'reduce' in Python) in functional languages elegantly encapsulates it:

http://en.wikipedia.org/wiki/Fold_(higher-order_function)

Perl and Ruby also have join functions - in Perl's case it's a built-in function, and in Ruby's case it's a method on Array. There's a Haskell join function (which works on arbitrary lists) in Data.List.Utils, but as you point out it's easy enough to write one as a fold:
Prelude> foldr1 (\x y -> x++", "++y) ["fred", "barney", "betty"]
"fred, barney, betty"
Note that I'm using a right fold rather than a left fold; if I'd used foldl1, it would take quadratic time (as would Andy's hand-rolled C# version, assuming C# strings are wrappers around null-terminated C strings). This is why one should always use the built-in join operator in scripting languages - they typically run in linear time, whereas naive hand-rolled versions would run in quadratic time. The Haskell version of join in Data.List.Utils is defined as
join :: [a] -> [[a]] -> [a]
join delim l = concat (intersperse delim l)
using the intersperse operator defined in Data.List. Which I think should run in linear time, but I'm having a really hard time thinking about its time and space behaviour - there's a reason I'm not a Haskell programmer!

Edited at 2013-01-15 07:02 pm (UTC)

Yup, I can do that in C#:
var duckerBrothers = new List<string> {"Andy", "Mike", "Hugh"};
var result = duckerBrothers.Aggregate((collected, next) => collected + ", " + next);

gives me "Andy, Mike, Hugh".

Oh, and C# strings aren't null-terminated - they're an immutable object that has a length property stored with it.

There are at least 2 other ways of doing that; the LINQ/functional way is much discussed in other comments. Apparently it's important to be able to do that instead.

And, your first method is just crying out to be re-written using a StringBuilder for cases where the list can be long and perf matters.
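A minimal sketch of what that StringBuilder rewrite might look like. JoinWithBuilder is a hypothetical name, and the structure deliberately mirrors the original loop.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BuilderJoin
{
    public static string JoinWithBuilder(IEnumerable<string> items)
    {
        var builder = new StringBuilder();
        foreach (var item in items)
        {
            // Same test as the original "output != string.Empty" check:
            // only add a separator once something has been written.
            if (builder.Length > 0)
            {
                builder.Append(", ");
            }

            builder.Append(item);
        }

        return builder.ToString();
    }

    public static void Main()
    {
        var duckerBrothers = new List<string> { "Andy", "Mike", "Hugh" };
        Console.WriteLine(JoinWithBuilder(duckerBrothers)); // Andy, Mike, Hugh
    }
}
```

Appending to a StringBuilder avoids allocating a new string on every pass through the loop, which is what makes the "+=" version slow on long lists.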

Edited at 2013-01-15 10:56 pm (UTC)

Yeah, if I had a couple of dozen brothers then I'd use a StringBuilder :->

i can't get past my reaction of "ugh. i'm not a fucking contortionist." every time i see OO code.
and yes, i live in PHP or occasionally shell scripts. :)

What makes it feel like contortion to you?

- the static typing - i'd like optional static typing sometimes for safety, but i like PHP's everything's a variable behaviour, even if the equivalence rules are screwy.
- the convoluted ways of having arrays/lists/collections/etc of things and them all being subtly different and not interchangeable, as opposed to PHP's powerful arrays you can bung anything into.
- little things like above string.empty and string.join
- i can never get my head round how to represent things i'm doing as a set of classes, or if i can it seems really contrived (like having classes purely to serve OO mentality rather than representing anything meaningful) and much harder to read/debug and/or less efficient than "my way".

Dynamic typing is great for small projects, but as soon as you're talking about lots of interlocking systems you want to know that what you've just been passed is _definitely_ the thing you were expecting to be passed, and that none of the other developers have changed things and broken them in a trivial way.

I can have an array of objects if I want, and put anything I want into it. But then whoever is writing that part of the application might put things into it I wasn't expecting, and then we'd be debugging all sorts of awfulness.

I don't have to use string.Empty - I can just use "". But I've gotten into the habit, because it avoids creating a new string.

string.Join is just a function. It's got to go _somewhere_ and there are millions of functions in the base libraries, so organising them by the type of object you're working with makes sense. You can think of it like a filing system (which, in many ways, it is).

None of this is useful for a small language doing a limited number of things in a small project. It's only really useful when you're talking about large projects doing complex stuff.